\chapter{Methods}
\begin{quote}
In theory, theory and practice are the same. In practice, they are
not.\\
--Yogi Berra
\end{quote}

\section{Introduction}
This chapter presents the theoretical and experimental methods
used throughout this thesis.  The first part deals with data
analysis techniques for characterizing quantum states of light
including the theory of quantum state tomography.  The second section
discusses laboratory methods for creating, manipulating and
measuring those states of light.

\section{Quantum state tomography}
One of the primary tasks for an experimentalist
is to correctly determine the state of a system under control
in the laboratory.  Quantum systems are no exception in this regard,
but it is only quite recently that
experimentalists have had sufficiently well-behaved, well-isolated systems and the
necessary measurement tools to unambiguously determine the quantum state
of a system directly from experimental data\cite{James2001}. The
procedure of completely estimating a quantum state from a set of experimental
measurements is called quantum state tomography (QST).  QST has
been performed on such varied single-particle quantum systems as the electronic state
of hydrogen\cite{Ashburn1990}, the vibrational state of an ensemble of
molecules, the motional state of a trapped ion\cite{Leibfried1996,Poya1996},
the internal angular momentum state of an ensemble of cesium
atoms\cite{Klose2001} and the state of nuclear spins of a molecule\cite{Chuang1998_2}.  Continuous variable quantum state tomography
has long been used to measure the Wigner
function of various states of a single optical
mode\cite{Smithey1993} and has more recently been
applied to measuring the Wigner functions of helium atoms in a double-slit
experiment\cite{Kurt1997} and the vibrational state of atoms in an
optical lattice\cite{Drobny2002}.  In optics, quantum state tomography can be
traced back to the work of G.G. Stokes\cite{Stokes1852}, who developed a minimal set
of measurements to describe the polarization of light.  The first tomography of an entangled quantum
state was performed on the polarization state of two photons generated
by spontaneous parametric downconversion\cite{James2001}.  Since then,
entangled states have been measured in the same way in trapped
ions\cite{Haffner2004}, superconducting qubits\cite{Steffen2006},
quantum-dot light sources\cite{Stevenson2006} and other systems.  It
is no exaggeration to say that quantum state tomography is now
considered necessary for verifying claims that a particular quantum
state has been generated in the lab.  The major aims of this thesis
are to extend quantum state tomography to systems of indistinguishable
particles as will be done in Chapter 3, and to try to improve upon
quantum state tomography as a measurement tool by selecting optimal
sets of measurements as in Chapter 4, or by finding better methods of
obtaining information about certain aspects of the quantum state as in
Chapter 5.  The present chapter will
introduce the tools that are now routinely used to determine the quantum
state of an experimental system and, in particular, describe how the
quantum state was measured in the experiments described in subsequent chapters.

In attempting to measure the state of a system under his control, the
experimentalist is confronted by a fundamental problem:
any measurement he performs will disturb the system in an
uncontrolled way, thereby destroying some information about the state.
The measurement postulate tells us that any precise
measurement of an unknown observable will put the system into a different
state, altering, in the process, information about all observables that do not
commute with the observable being measured.  This makes it impossible
to characterize the quantum state of a single copy of a quantum system.  

Since determining the quantum state of a single system is impossible,
the next best thing from an experimental point of view is to measure
the state of a set of identically prepared systems.  This can either
be done by repeating the same preparation steps multiple times on a
single system, as is done, for instance, in ion trap quantum systems\cite{Haffner2004}, or
by developing a source that produces many systems all in the same
state, as is usually done in experiments with photons\cite{James2001}.  An additional
problem now arises, however.  The cautious experimentalist is not
certain that the same preparation steps will result in the system
being put in the same state unless he is able to measure that the
system is in the same state after multiple preparations.  Similarly,
he needs a way of experimentally verifying that a source of
a large number of quantum systems successfully puts them all in the
same state.  Since he cannot measure the
state of a single system this seems like a difficult business.
Moreover, since the measurements being made are quantum mechanical, one
expects to obtain different outcomes for the same measurement from one repetition to the next due to quantum
uncertainty even if the systems are identically prepared.  How
can the randomness due to experimental error be distinguished from the
inherent randomness that one expects for quantum measurements?  This
conundrum can be resolved by describing the system in terms of the
density matrix introduced in the last chapter.  

Imagine a source that produces some set of states $\left\{\ket{\psi_i}\right\}$ with probabilities
$\left\{p_i\right\}$.  The inclusion of classical probabilities can take into account any random
fluctuation in the state preparation due to experimental
imperfections.  Define the density matrix $\rho$ as 
\begin{align}
\rho=\sum_{i} p_i \ket{\psi_i}\bra{\psi_i}.
\end{align}
The density matrix can be seen either as a description of the source
or of the ensemble of systems produced by that source.  From this
point of view, the state
$\ket{\psi_i}$ is produced with relative frequency $p_i$.
Alternatively, $\rho$ can be viewed as the state of a particular
system produced by the source in which case the $\left\{p_i\right\}$
represent our uncertainty about which state was produced.  As a
classical probability distribution over quantum states,
the density matrix takes account of both the statistical randomness of
the source and the fundamental quantum randomness of the measurements.

\ignore{
The density matrix description includes the Dirac ket
description of the state as the special case where there is only one
possible state and $p_1=1$.  It has some clear advantages over the ket
notation, however.  First, $\rho$ is a Hermitian operator and
therefore a quantum-mechanical observable.  This means that in principle
it is measurable, making it the natural description for the
experimentalist.   Second, it allows for a consistent description of
partial systems obtained by considering only one particle in a
multi-particle system, say.  Finally it contains as a special
case classical probability distributions and is therefore vital to
making two connections, one between quantum mechanics and
thermodynamics and another between quantum mechanics and Shannon's
theory of information.
}
\subsection{Properties of the density matrix}
We formally restate the properties of the density matrix mentioned in
Chapter 1.  Detailed proofs of
these properties can be found in Nielsen and
Chuang\cite{MikeandIke}.

\begin{itemize}
\label{DensityMatrixProperties}
\item {\bf Hermiticity:} The density matrix is a Hermitian operator.
\item {\bf Positive semi-definiteness:}  All of the eigenvalues of the
  density matrix are non-negative (for an ensemble of orthogonal states
  they are just the
  probabilities $\left\{p_i\right\}$).  Equivalently $\bra{\phi} \rho
  \ket{\phi}\geq 0$ for any state $\ket{\phi}$.
\item {\bf Trace condition:} $\text{Tr}\,\rho=1$.  This follows from the fact
  that $\sum_i{p_i}=1$.
\item {\bf Measurement:} For a system in state $\rho$, the expectation
  value of a measurement described by Hermitian operator ${\bf
    \hat{A}}$ is $\text{Tr} \rho {\bf \hat{A}}$.
\end{itemize}
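As a concrete illustration, the properties above can be checked numerically.  The following sketch (illustrative only, in Python rather than the Matlab used for the analysis in this thesis) builds $\rho$ for a hypothetical single-qubit ensemble and verifies Hermiticity, positive semi-definiteness, the trace condition and the measurement rule:

```python
import numpy as np

# Hypothetical single-qubit ensemble: |0> with p = 0.75, |+> with p = 0.25.
psi_0 = np.array([1.0, 0.0], dtype=complex)
psi_plus = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)
probs, states = [0.75, 0.25], [psi_0, psi_plus]

# rho = sum_i p_i |psi_i><psi_i|
rho = sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))

# Hermiticity
assert np.allclose(rho, rho.conj().T)
# Positive semi-definiteness: all eigenvalues non-negative
assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)
# Trace condition
assert np.isclose(np.trace(rho).real, 1.0)
# Measurement rule, here for Pauli-Z: <Z> = Tr(rho Z)
Z = np.diag([1.0, -1.0])
print(np.trace(rho @ Z).real)  # 0.75*1 + 0.25*0 = 0.75
```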


\subsection{Measuring the density matrix}
Since the density matrix offers a complete description of the quantum
state created by a laboratory process (over many repetitions), the
most general state characterization is a measurement of the density
matrix.  For a general unknown state this cannot be done in a single
shot, but rather requires non-commuting measurements to be done in different
bases.  The measurement process can be thought of as a set of rotations
of the state into different bases, followed by projections
onto orthogonal rays in the Hilbert space of states.  This
procedure involving rotations followed by snapshot measurements bears
some resemblance to classical imaging techniques such as Computed
Axial Tomography (CAT) scans, where a three-dimensional image is
constructed out of many two-dimensional x-ray absorption images taken through
different planes.

One assumes in QST that nothing is known about the state to be
measured \emph{a priori}.  The measurements taken
should be able to reproduce the density matrix regardless of the input
state.  

We assume that the experimentalist has available to him some set of
projective measurements (PVMs)\footnote{More generally one could consider
  Positive Operator-Valued Measures (POVMs), but PVMs are
  sufficient and far more common in real experiments.  The analysis is
essentially the same.} denoted
${\left\{P^{(i)}\right\}}$.  He may perform each of these measurements on an
  arbitrarily large number of individual systems from the source in
  order to determine $\left<P^{(i)}\right>$ with arbitrary precision.
  He wishes to determine the density matrix in some particular basis
  $\left\{\ket{\phi_q}\right\}$.  Since 
\begin{align}
\rho&=\sum_i p_i \ket{\psi_i}\bra{\psi_i}\notag\\
&=\sum_i p_i \left(\sum_q c_{q,i}\ket{\phi_q}\right)\left(\sum_{q'}
c_{q',i}^*\bra{\phi_{q'}}\right)\notag \\
&=\sum_{i,q,q'} p_i c_{q,i}c^*_{q',i} \ket{\phi_q}\bra{\phi_{q'}},
\label{rho_in_phi_basis}
\end{align}
this amounts to measuring each of the complex numbers $\sum_i p_i c_{q,i}c^*_{q',i}$.
Writing the density matrix in this way highlights the impossibility of
knowing whether variation in measurement outcomes arises due to experimental errors or
fundamental quantum mechanical uncertainty.  The classical
probabilities associated with experimental errors $p_i$ and the quantum mechanical
amplitudes $c_q$ always occur together as a product so that the
contribution due to each cannot be determined for any particular
measurement.
\subsubsection{Linear inversion}
The simplest and most intuitive approach to quantum state
estimation is to regard the mapping of experimental measurements
to a density matrix as a linear inversion problem.
Linear inversion is possible because Hermitian operators form a
Hilbert space, and so the density matrix will be related to a
complete set of measurement expectation values through a linear
map (i.e. a matrix) that is a function only of the particular
measurement operators that were implemented.  The major downside
to this approach is that it does not allow one to easily take
account of the positivity constraint on the density matrix.
Linear inversion remains valuable because it is the only available analytic
inversion tool, which makes it useful for conceptual
understanding of state estimation and as a means of calculating
error propagation without resorting to Monte Carlo techniques.

Consider the expectation value of the measurement $P^{(a)}$  
\begin{align}
\left<P^{(a)}\right>&=\text{Tr}\left\{\rho P^{(a)} \right\}\notag \\
&=\sum_{m} \bra{\phi_m}\rho P^{(a)}\ket{\phi_m}\notag\\
&=\sum_{m} \bra{\phi_m}\sum_i p_i \ket{\psi_i}\bra{\psi_i} P^{(a)}\ket{\phi_m}\notag\\
&=\sum_{m,i} p_i \braket{\phi_m|\psi_i} \bra{\psi_i} P^{(a)}\ket{\phi_m}\notag\\
&=\sum_{m,i} p_i c_{m,i} \left(\sum_{q'}c^*_{q',i}\bra{\phi_{q'}}\right) P^{(a)}\ket{\phi_m}\notag\\
&=\sum_{m,q'} \left(\sum_i p_i c_{m,i} c^*_{q',i}\right) \bra{\phi_{q'}}P^{(a)}\ket{\phi_m}.
\label{measurement_outcome_in_phi_basis}
\end{align}
We note that since the basis vectors $\ket{\phi_m}$ and the
measurement operator $P^{(a)}$ depend only on the measurement setup,
not the measurement outcomes, $\bra{\phi_{q'}}P^{(a)}\ket{\phi_m}$ are
fixed for a given quantum state tomography setup, independent of the
input state.  The quantities $\sum_i p_i
c_{m,i} c^*_{q',i}$, on the other hand, depend on the density matrix
$\rho$.  In
fact, as was seen in equation \ref{rho_in_phi_basis}, they uniquely
define it.  Once they are extracted from measurement outcomes
$\rho$ can be reconstructed. 

If many expectation values $\left<P^{(a)}\right>$ are measured then many
equations of the form of (\ref{measurement_outcome_in_phi_basis}) are
obtained.  These can be arranged in a matrix form as 

\tiny
\begin{align}
 \left( \begin{array}{c}
\left<P^{(1)}\right>  \\
\vdots  \\
\left<P^{(n)}\right>  \end{array} \right)=
 \left( \begin{array}{cccccc}
\bra{\phi_1}P^{(1)}\ket{\phi_1} & \cdots &
\bra{\phi_d}P^{(1)}\ket{\phi_1} & \bra{\phi_1}P^{(1)}\ket{\phi_2} &
\cdots & \bra{\phi_d}P^{(1)}\ket{\phi_d}  \\
\vdots & & & & & \vdots \\
\bra{\phi_1}P^{(n)}\ket{\phi_1} & \cdots &
\bra{\phi_d}P^{(n)}\ket{\phi_1} & \bra{\phi_1}P^{(n)}\ket{\phi_2} &
\cdots & \bra{\phi_d}P^{(n)}\ket{\phi_d}  \\ \end{array} \right) \left( \begin{array}{c}
\sum_i p_i
c_{1,i} c^*_{1,i}  \\
\vdots  \\
\sum_i p_i
c_{d,i} c^*_{d,i}  \end{array} \right)
\label{matrix_expectation_values_as_a_function_of_density_matrix}
\end{align}
\normalsize

More compactly, we can define a matrix
\begin{align}
{\bf M}= \left( \begin{array}{cccccc}
\bra{\phi_1}P^{(1)}\ket{\phi_1} & \cdots &
\bra{\phi_d}P^{(1)}\ket{\phi_1} & \bra{\phi_1}P^{(1)}\ket{\phi_2} &
\cdots & \bra{\phi_d}P^{(1)}\ket{\phi_d}  \\
\vdots & & & & & \vdots \\
\bra{\phi_1}P^{(n)}\ket{\phi_1} & \cdots &
\bra{\phi_d}P^{(n)}\ket{\phi_1} & \bra{\phi_1}P^{(n)}\ket{\phi_2} &
\cdots & \bra{\phi_d}P^{(n)}\ket{\phi_d}  \\ \end{array} \right)
\end{align} 
and vectors
\begin{align}
\vec{P}= \left( \begin{array}{c}
\left<P^{(1)}\right>  \\
\vdots  \\
\left<P^{(n)}\right>  \end{array} \right),\vec{\rho}=\left( \begin{array}{c}
\sum_i p_i
c_{1,i} c^*_{1,i}  \\
\vdots  \\
\sum_i p_i
c_{d,i} c^*_{d,i}  \end{array} \right)
\end{align}
Then equation
(\ref{matrix_expectation_values_as_a_function_of_density_matrix})
becomes
\begin{align}
\vec{P}={\bf{M}}\vec{\rho}
\label{PMrho}
\end{align}
Clearly if ${\bf M}$ has an inverse ${\bf M}^{-1}$ then one can write
the density matrix elements $\vec{\rho}$ in terms of the measurements
$\vec{P}$ as 
\begin{align}
\vec{\rho}={\bf M}^{-1} \vec{P}
\end{align} 
In order for ${\bf M}$ to have an inverse, it must be
a square matrix so that the number of rows (i.e. the number of
measurements) is the same as the number of columns (i.e. the number of
density matrix elements, $d^2$).  

If the number of linearly independent measurements is less than the number of density matrix
elements (i.e. if $\text{rank}\,{\bf M}<d^2$), then ${\bf M}$ is not
invertible and the inversion problem is under-determined.

When the number of measurements
exceeds the number of density matrix elements, the problem is
over-determined.  Linear inversion is still possible by using
techniques from linear regression theory\cite{DataAnalysisBriefBook}.  Imagine that
there are $m$ measurements so that $\vec{P}$ is of length $m$ and ${\bf
  M}$ is an $m\times d^2$ dimensional matrix.  We
can multiply both sides of equation \ref{PMrho} by ${\bf M}^\dagger$ to obtain 
\begin{equation}
{\bf M}^\dagger \vec{P}={\bf M}^\dagger {\bf M}\vec{\rho}
\end{equation} 
Since ${\bf M}$ is $m \times d^2$, ${\bf M}^\dagger$ is $d^2 \times
m$, so ${\bf M}^\dagger \vec{P}$ is a $d^2$ dimensional vector and ${\bf
  M}^\dagger {\bf M}$ is a $d^2 \times d^2$ square matrix.  Provided the
measurements are tomographically complete, ${\bf M}^\dagger {\bf M}$
has rank $d^2$ and its inverse, $\left({\bf
  M}^\dagger {\bf M}\right)^{-1}$, allows us to solve the equation for $\vec{\rho}$
\begin{equation}
\left({\bf
  M}^\dagger {\bf M}\right)^{-1}{\bf M}^\dagger \vec{P}=\vec{\rho}
\label{eq:least_squares}
\end{equation} 
$\vec{\rho}$ represents the density matrix that minimizes
the sum of the squared deviations between its predicted expectation values and
the measured data\cite{DataAnalysisBriefBook}.  It is, in other words, the least-squares estimator
of the true density matrix.
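The least-squares inversion of equation \ref{eq:least_squares} can be sketched numerically.  The following illustrative Python fragment (a sketch, not the analysis code used in this thesis) builds ${\bf M}$ for a hypothetical single-qubit tomography using the six Pauli-eigenstate projectors, then recovers $\rho$ from noiseless expectation values; the pseudo-inverse plays the role of $\left({\bf M}^\dagger {\bf M}\right)^{-1}{\bf M}^\dagger$:

```python
import numpy as np

d = 2
# Overcomplete single-qubit projector set: eigenstates of Z, X and Y.
kets = [np.array(v, dtype=complex) / np.linalg.norm(v) for v in
        ([1, 0], [0, 1], [1, 1], [1, -1], [1, 1j], [1, -1j])]
projectors = [np.outer(k, k.conj()) for k in kets]

# Row a of M is arranged so that M @ vec(rho) = <P^(a)>,
# using Tr(rho P) = sum_{q,q'} rho_{q q'} P_{q' q}.
M = np.array([P.T.flatten() for P in projectors])

# Simulate noiseless expectation values for a slightly mixed |+> state.
rho_true = 0.9 * np.outer(kets[2], kets[2].conj()) + 0.1 * np.eye(d) / d
P_vec = M @ rho_true.flatten()

# Least-squares inversion via the pseudo-inverse of M.
rho_est = (np.linalg.pinv(M) @ P_vec).reshape(d, d)
print(np.allclose(rho_est, rho_true))  # True
```

With noiseless data the reconstruction is exact; with finite counting statistics the same inversion returns the least-squares estimate discussed above.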

Unfortunately, because no positivity constraint was imposed, it
is possible that the estimated $\rho$ will have negative
eigenvalues.  This makes it impossible to calculate quantities
from $\rho$ that depend on its positivity, including
the entanglement-related properties like concurrence and the
fidelity with other states.  This fact usually makes it
necessary to employ a more sophisticated methodology like
maximum-likelihood fitting to be discussed in the next section.  It should be noted, though, that if
the density matrix obtained from linear fitting is positive,
then it will also be the maximum-likelihood estimator for the
least-squares likelihood function and is usually computationally much
easier to obtain.

\subsubsection{Error analysis}
Because the linear inversion is analytic, it can give
insight into how errors propagate from experimentally collected counts
to the density matrix.  We will be making use of the error analysis
derived here in Chapter 4.

In photon counting experiments, non-systematic errors
arise largely from counting statistics, or shot noise.  For the
intensities encountered in the experiments in this thesis, the
probability of obtaining more than one photon pair in a single-photon
coherence time is vanishingly small.\footnote{Although in
  high-intensity pulsed SPDC experiments such as those described in
  \cite{deMartini2001,Eisenberg2004}, bosonic stimulation can make for
  highly non-Poissonian counting statistics.}  As a consequence, the
probability of creating a photon
pair in an infinitesimal time $\text{d}t$ is essentially constant,
giving rise to
Poissonian counting statistics over finite times.  This can be seen
from experimental evidence in Figure
\ref{fig:Poissonian_statistics}.  

\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[width=\columnwidth]{Figures/poisson_histogram.eps}}
  }
  \caption{Measured rate of coincident photon events in a 200 ms
    window from a Type-I SPDC source.  The rates are very well
    fit by a Poissonian distribution centered at 26.21.}
  \label{fig:Poissonian_statistics}
  \end{figure}

Expectation values $\left<\hat{P}^{(i)}\right>$ are estimated from
counts by dividing the number of counts for a given PVM element by the
total number for
a complete PVM, or \emph{basis}, over a given counting period.  For a two-photon
polarization state
we can write
\begin{equation}
P_{kq} \equiv \left<\hat{P}^{(i)}_{kq}\right>_{\text{est}}=\frac{n_{kq}}{\sum_{i=1}^4
  n_{iq}},
\end{equation}
where $n_{kq}$ is the number of observed counts for the
$k^{\text{th}}$ PVM element in the $q^{\text{th}}$ basis.  In order to
calculate the error $\delta
{P}^{(i)}_{kq}$
in $\left<\hat{P}^{(i)}_{kq}\right>_{\text{est}}$, we first calculate
the dependence of the expectation value estimates on the counts by
evaluating the partial derivatives 
\begin{align}
\left(\frac{\partial P_{kq}}{\partial
    n_{kq}}\right)&=\frac{\sum_{i\neq k}n_{iq}}{\left(\sum_{j=1}^4 n_{jq}\right)^2}\\
\left(\frac{\partial P_{kq}}{\partial
    n_{mq}}\right)&=\frac{-n_{kq}}{\left(\sum_{j=1}^4
    n_{jq}\right)^2} \quad m \neq k\\
\left(\frac{\partial P_{kq}}{\partial
    n_{mt}}\right)&=0 \quad  t \neq q.
\end{align}
The covariance between $P_{kq}$ and $P_{ab}$ is given by
\begin{align}
\delta P_{kq} \delta P_{ab}&=\sum_{x,y} \left(\frac{\partial P_{kq}}{\partial
    n_{x}}\right)\left(\frac{\partial P_{ab}}{\partial
    n_{y}}\right) \delta n_{x}\delta n_{y}\\
&=\sum_{y} \left(\frac{\partial P_{kq}}{\partial
    n_{y}}\right)\left(\frac{\partial P_{ab}}{\partial
    n_{y}}\right) \delta n_{y}^2\\
&=\sum_{y} \left(\frac{\partial P_{kq}}{\partial
    n_{y}}\right)\left(\frac{\partial P_{ab}}{\partial
    n_{y}}\right) n_{y},
\end{align}
where in the second line we have used the fact that the different
photon counts are statistically uncorrelated and in the last line we
have replaced the variance with the estimated mean since the photon counting
statistics are Poissonian.
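The partial derivatives and covariances above are straightforward to evaluate.  A minimal Python sketch for a single four-outcome basis, using example counts of the kind measured in the lab (the specific numbers are illustrative):

```python
import numpy as np

# Counts for one four-outcome basis (illustrative values).
n = np.array([64689, 987, 1019, 73480], dtype=float)
N = n.sum()
P = n / N  # estimated expectation values P_k = n_k / sum_j n_j

# Jacobian dP_k/dn_m = (delta_km * N - n_k) / N^2, matching the
# partial derivatives derived above.
J = (np.diag([N] * 4) - np.outer(n, np.ones(4))) / N**2

# Poisson statistics: Var(n_m) = n_m and the counts are uncorrelated,
# so Cov(P_k, P_a) = sum_m (dP_k/dn_m)(dP_a/dn_m) n_m.
cov = J @ np.diag(n) @ J.T
print(np.sqrt(np.diag(cov)))  # one-sigma errors on each P_k
```

Note that the rows of the covariance matrix sum to zero, reflecting the fact that the normalized $P_k$ within one basis are constrained to sum to one.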

The errors in the density matrix can then be calculated from
this covariance.  Care must be taken to separate the variance in the real and
imaginary parts of $\rho$.
\begin{align}
\left(\Delta \Re \rho_{ij}\right)^2&= \sum_{kq,ab}\Re \left[\frac{\partial
  \rho_{ij}}{\partial P_{kq}}\right]\Re\left[ \frac{\partial
  \rho_{ij}}{\partial P_{ab}}\right]\delta P_{kq} \delta
  P_{ab}\\
&=\sum_{kq,ab}\Re\left[\left(\left(\mathbf{M}^\dagger\mathbf{M}\right)^{-1}\mathbf{M}^\dagger
  \right)_{ij,kq}\right]\Re\left[\left(\left(\mathbf{M}^\dagger\mathbf{M}\right)^{-1}\mathbf{M}^\dagger
  \right)_{ij,ab}\right]\delta P_{kq} \delta
  P_{ab}\\
\left(\Delta \Im \rho_{ij}\right)^2&= \sum_{kq,ab}\Im \left[\frac{\partial
  \rho_{ij}}{\partial P_{kq}}\right]\Im\left[ \frac{\partial
  \rho_{ij}}{\partial P_{ab}}\right]\delta P_{kq} \delta
  P_{ab}\\
&=\sum_{kq,ab}\Im\left[{\left(\left(\mathbf{M}^\dagger\mathbf{M}\right)^{-1}\mathbf{M}^\dagger
    \right)}_{ij,kq}\right]\Im\left[\left(\left(\mathbf{M}^\dagger\mathbf{M}\right)^{-1}\mathbf{M}^\dagger
  \right)_{ij,ab}\right]\delta P_{kq} \delta
  P_{ab}.
\end{align}

A similar analysis may be carried out for systematic errors such as
waveplate calibration errors, although for datasets containing fewer
than $10,000$ counts per basis, the random errors were found to be
dominant.
 
\subsection{The positivity constraint}
Linear inversion will sometimes result in unphysical density matrix
estimates for two reasons.  First, it is impossible to exactly measure an
expectation value in a finite number of measurements.  While one can
obtain arbitrary precision by measuring for an arbitrarily long
time, the error in the estimation of the expectation value will be of
order $1/\sqrt{N}$ for $N$ measurements done on uncorrelated copies of
the system.  Second, a systematic error in the measurement
apparatus, such as a misalignment of waveplate axes, can result in
measurements that could not have been produced by any quantum state,
even after an arbitrarily long measurement.

These errors, though small, can result in the linear density matrix
reconstruction described in the previous section giving a negative
density matrix.  This problem is especially pronounced when the true
state of the system under study is very pure.  In that case all
but one of the eigenvalues of the density matrix are already zero or close to it, and
a small error in one of the measurements is enough to make some
of the estimated eigenvalues negative.  

For some applications this is not a significant problem.  The
magnitude of the negative eigenvalues will be small relative to one if
the errors are also small, in which case this negativity might be
negligible.  There are, however, many calculations that depend on the
strict positive-semi-definiteness of the density matrix and that give
nonsensical results when a non-positive-semidefinite matrix is used.
Consequently it is often useful to turn to a more sophisticated
analysis which is guaranteed to generate a positive-semidefinite density
matrix from any experimental data.  
\subsection{Maximum-likelihood estimation}
In statistics the problem of matching the parameters of a
physical model to noisy data is well studied and is
addressed by a technique known as maximum likelihood
estimation\cite{DataAnalysisBriefBook}.  This
approach takes the view that given a dataset, an error model and a
parameterized model of the system, one can calculate the probability
that a given set of parameters in the model generated a particular
datum in the data set.  A likelihood function can be defined that
represents the probability that the entire data set was generated from
the model for a given set of parameters.  A numerical search can then
be used to find the set of parameters that maximize this likelihood
function.  This technique has been used successfully in
quantum state tomography to determine the density matrix that was most
likely to have produced a particular set of measurement outcomes\cite{James2001}.

Early work on state tomography relied on an explicit parameterization
of a finite-dimensional density matrix based on a Cholesky decomposition\cite{DataAnalysisBriefBook,James2001}.  The
Cholesky decomposition is akin to a square root operation for
matrices.  Any positive semi-definite hermitian matrix ${\bf A}$ can be
decomposed as ${\bf A}=\tau \tau^{\dagger}$ where $\tau$ is a lower
triangular matrix with real-valued diagonals.  The trace condition for
density matrices can be satisfied by explicitly normalizing to express
$\rho$ as $\rho=\tau \tau^{\dagger}/{\rm Tr}\left\{\tau
\tau^{\dagger}\right\}$.  Parameterizing $\tau$ instead of
parameterizing $\rho$ reduces a constrained multi-dimensional
search problem to an unconstrained one which can be solved
much more efficiently.    
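The Cholesky parameterization can be sketched as follows.  This illustrative Python function (a sketch, not the thesis analysis code) maps $d^2$ unconstrained real parameters to a matrix that is Hermitian, positive semi-definite and unit-trace by construction:

```python
import numpy as np

def rho_from_params(t, d=2):
    """Build a physical density matrix from d^2 real parameters via
    rho = tau tau^dag / Tr(tau tau^dag), tau lower-triangular."""
    tau = np.zeros((d, d), dtype=complex)
    # d real diagonal entries
    tau[np.diag_indices(d)] = t[:d]
    # d(d-1)/2 complex entries below the diagonal: real parts, then imaginary
    rows, cols = np.tril_indices(d, k=-1)
    tau[rows, cols] = t[d:d + len(rows)] + 1j * t[d + len(rows):]
    A = tau @ tau.conj().T
    return A / np.trace(A).real

rho = rho_from_params(np.array([1.0, 0.5, 0.3, -0.2]))
# By construction rho satisfies all three density matrix constraints.
assert np.allclose(rho, rho.conj().T)
assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)
assert np.isclose(np.trace(rho).real, 1.0)
```

An unconstrained numerical optimizer can then search over the parameter vector $t$ directly.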

This method has a few problems, however.  First, the Cholesky
decomposition is not unique for positive semi-definite matrices (it is
unique for strictly positive matrices), meaning that if one or more
eigenvalues of $\rho$ are zero there will generally be multiple maxima
of the likelihood function which could lead to convergence problems in
finding the global maximum.  Moreover, there is no guarantee that all
the local maxima will correspond to the same density matrix.  In fact,
maximization starting from a randomly generated set of parameters will
generally result in different final density matrices.  In
practice the maximization routine is usually seeded with a density
matrix obtained
by linear inversion and will converge properly for any
well-behaved likelihood function, but it remains a problem that in
principle this method does not guarantee convergence to the
global maximum\cite{Kosut2004}\footnote{Based on discussions at
  this summer's Workshop on Quantum State Estimation it appears
  that Joe Altepeter and Paul Kwiat have developed techniques
  for guaranteeing global convergence using a Cholesky
  decomposition method.  The comments on the problems with local
  minima are based on my own experiences in following the
  prescription of James et al.\cite{James2001}.}  

More recently many groups working on quantum state tomography have
begun solving the maximum-likelihood problem using the techniques of
\emph{convex optimization}.  This is a class of optimization methods that
efficiently reduce and solve optimization problems over convex
sets.  It was pointed out by Kosut\cite{Kosut2004} that the
positivity and unit-trace constraints on density matrices make
maximum-likelihood fitting of a density matrix a convex problem to
which convex optimization techniques can be applied.  In particular,
the Matlab toolkit SeDuMi\cite{Sturm1999} handles these problems quite
well and was used for much of the maximum-likelihood analysis done in
this thesis.

\subsubsection{The likelihood function}
The objective of maximum-likelihood fitting is to find the
density matrix that maximizes the likelihood of having generated
the measured data.  We can write the conditional probability of
obtaining data set $D$ given that the system is described by the
density matrix $\rho$ as 
\begin{equation}
\text{Prob}\left\{D|\rho\right\}=\prod_{\alpha,\gamma}\left(p_{\alpha,\gamma}\left(\rho\right)^{n_{\alpha,\gamma}}\right),
\label{eq:likelihood}
\end{equation}
where $\alpha$ and $\gamma$ respectively index the outcome and
the measurement basis for different waveplate settings.
$n_{\alpha,\gamma}$ is the measured number of occurrences of a particular
outcome and $p_{\alpha,\gamma}\left(\rho\right)$ is the
probability of occurrence for that outcome when the system is
described by density matrix $\rho$.  The product is the
probability of obtaining exactly the data set $D$ consisting of
the counts $n_{\alpha,\gamma}$ if the state were described by
$\rho$.  The maximum-likelihood estimator of the true density
matrix is the density matrix that maximizes this probability.
In practice, it is common to take the logarithm of both sides of
the equation
\begin{align}
\log \text{Prob}\left\{D|\rho \right\} &= \sum_{\alpha,\gamma}n_{\alpha,\gamma}\log
  p_{\alpha,\gamma}\left(\rho\right)\\
&=\sum_{\alpha,\gamma}n_{\alpha,\gamma}\log
  \text{Tr}\left\{\rho \hat{O}_{\alpha,\gamma} \right\},
\label{eq:log-likelihood}
\end{align}
where the $\left\{\hat{O}_{\alpha,\gamma}\right\}$ are the
projective operators measured in the experiment.  Since the
probabilities are positive numbers between zero and one, this
sum will be negative.  Because the logarithm is a monotonic
function, the density matrix that maximizes the
likelihood of equation \ref{eq:likelihood} will also maximize
the log-likelihood of equation \ref{eq:log-likelihood}.
Numerical solvers often require a minimization rather than a
maximization problem, so one can equivalently try to minimize
the negative-log-likelihood which is simply $-1$ times equation \ref{eq:log-likelihood}.
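The negative log-likelihood is simple to evaluate for a given set of counts and projectors.  A minimal Python sketch with hypothetical single-qubit data (two Z-basis projectors and 100 counts) shows that the state matching the empirical frequencies scores better than a different candidate:

```python
import numpy as np

def neg_log_likelihood(rho, counts, operators):
    # -log Prob{D|rho} = -sum_a n_a log Tr(rho O_a)
    probs = np.array([np.trace(rho @ O).real for O in operators])
    return -np.sum(counts * np.log(probs))

# Hypothetical data: 100 shots in the Z basis on a single qubit.
ops = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
counts = np.array([70, 30])

rho_freq = np.diag([0.7, 0.3])       # matches the empirical frequencies
rho_other = np.diag([0.5, 0.5])
print(neg_log_likelihood(rho_freq, counts, ops) <
      neg_log_likelihood(rho_other, counts, ops))  # True
```

In a full reconstruction this function would be minimized over a physical parameterization of $\rho$, such as the Cholesky form described above.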

The log-likelihood problem does not assume any particular
distribution function on $\rho$ for the probabilities.  If one assumes
that the outcomes $n_{\alpha,\gamma}$ are large enough to give
a good estimate of the relative probabilities of occurrence for the outcomes
$p^{emp}_{\alpha,\gamma}=n_{\alpha,\gamma}/l_\gamma$ where $l_\gamma$
is the total for all the outcomes in a given basis,
$l_\gamma=\sum_\alpha n_{\alpha,\gamma}$, then one can replace the
log-likelihood function with the least-squares likelihood function\cite{Kosut2004}.

The negative log-likelihood is a convex function of $\rho$ and so the
minimization problem is a convex one.  Unfortunately the
numerical solver we use, SeDuMi, is as yet incapable of
directly minimizing the negative log-likelihood.  It can, however, solve the
least-squares problem, offering a solution that is quite close to the
maximum-likelihood solution.  From this starting point
we can iteratively search for the maximum-likelihood solution and
will generally achieve fast convergence.

In practice the least-squares fit is usually a very good approximation
to the maximum-likelihood solution, with density matrix terms differing by
less than 1 percent when there are more than 100 counts per basis.
Using the least-squares fit as a starting point, the maximum-likelihood
solution could usually be found to within a part in $10^4$ within five
iterations of our search algorithm.

\subsection{An example: Two-photon polarization density matrix}
We can illustrate these approaches to quantum state tomography  by reconstructing a two-qubit
polarization state from experimental data.  We will use the set
of 36 measurements of coincidence counts taken in the lab and shown in Table \ref{tab:tomo_measurements}.  
\begin{table}[t]
  \begin{tabular}{| c || c | c | c | c || c | c | c | c |}
\hline 
Projection & \multicolumn{4}{|c||}{Counts} &
    \multicolumn{4}{c|}{Frequencies}\\
\hline \hline
    HH/HV/VH/VV & 64689 & 987 & 1019 & 73480 & 0.4615 & 0.0070 &0.0073 & 0.5242 \\ \hline
    HD/HA/VD/VA & 30172 & 53713 & 36952 & 34271 & 0.1945 &
    0.3463 & 0.2382 & 0.2209 \\ \hline
    HL/HR/VL/VR & 34555 & 48378 & 34997 & 39004 & 0.2202 &
    0.3083 & 0.2230 & 0.2485 \\ \hline
    DH/DV/AH/AV & 36285 & 35386 & 27204 & 43428 & 0.2550 & 0.2487 & 0.1912 & 0.3052\\ \hline
    DD/DA/AD/AA & 66755 & 6086 & 5601 & 80338 & 0.4158 & 0.0436 & 0.0367 & 0.5038 \\\hline
    DL/DR/AL/AR & 38466 & 41384 & 34819 & 48034 &  0.2347 & 0.2562 & 0.2127 & 0.2964\\\hline
    RH/RV/LH/LV & 30616 & 40776 & 36182 & 39089 & 0.2074 &
    0.2796 & 0.2447 & 0.2684\\\hline
    RD/RA/LD/LA & 36251 & 39972 & 38273 & 45903 & 0.2245 & 0.2509 & 0.2369 & 0.2877 \\\hline
    RL/RR/LL/LR & 63404 & 4592 & 5262 & 87427 & 0.3902 & 0.0339 & 0.0343 & 0.5416\\\hline
  \end{tabular}
\caption{Laboratory data for the closest state to
  $\frac{1}{\sqrt{2}}\left(\ket{HH}+\ket{VV}\right)$ that we can
  make.  The projection column lists the measurements performed
  on the two photons and the counts column gives the measured
  number of coincidence counts.  The frequencies column gives
  each number of counts normalized to the total number of counts
  measured in that basis.} 
\label{tab:tomo_measurements}
\end{table}

If we apply a linear inversion to this over-complete data set we
obtain the following density matrix:
\footnotesize
\begin{equation}
\rho_{\text{lin}}=\left(
\begin{array}{cccc}
 0.4602 & -0.0033-0.0463i &  -0.0391-0.0044i & 0.4257-0.0095i\\
-0.0033+0.0463i &  -0.0119 &  0.0060-0.0217i &  -0.0149+0.0557i\\
-0.0391+0.0044i &  0.0060 + 0.0217i &  0.0262 & -0.0346+0.0382i\\
0.4257+0.0095i &  -0.0149-0.0557i & -0.0346-0.0382i & 0.5255   
\end{array}
\right).
\end{equation}
\normalsize
The eigenvalues of the matrix are 0.9300, 0.0755, 0.0225 and
$-0.0279$, meaning that the density matrix is not positive
semi-definite and is therefore unphysical.  In fact, the
unphysicality of the density matrix is obvious from the negative
value of the population $\rho_{22}=-0.0119$.  
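This check is easy to automate.  The snippet below (a sketch, with the
matrix transcribed to the four decimal places printed above) confirms that
the linear-inversion result is Hermitian with unit trace but has a negative
eigenvalue:

```python
import numpy as np

# rho_lin transcribed from the linear-inversion result above
rho_lin = np.array([
    [ 0.4602+0.0000j, -0.0033-0.0463j, -0.0391-0.0044j,  0.4257-0.0095j],
    [-0.0033+0.0463j, -0.0119+0.0000j,  0.0060-0.0217j, -0.0149+0.0557j],
    [-0.0391+0.0044j,  0.0060+0.0217j,  0.0262+0.0000j, -0.0346+0.0382j],
    [ 0.4257+0.0095j, -0.0149-0.0557j, -0.0346-0.0382j,  0.5255+0.0000j],
])

hermitian = np.allclose(rho_lin, rho_lin.conj().T)
evals = np.linalg.eigvalsh(rho_lin)   # real, sorted ascending
# evals[0] is negative, so rho_lin is not positive semi-definite
# (and hence not a physical state), even though its trace is one
```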

If we instead apply maximum likelihood estimation with a least-squares likelihood
function we obtain
\footnotesize
\begin{equation}
\rho_{\text{ml-ls}}=\left(
\begin{array}{cccc}
 0.4478 & -0.0066+0.0460i & -0.0397+0.0053i & 0.4318+0.0093i\\
 -0.0066-0.0460i & 0.0073 & 0.0043+0.0079i & -0.0090-0.0526i\\
 -0.0397-0.0053i & 0.0043-0.0079i & 0.0314 & -0.0339-0.0370i\\
 0.4318-0.0093i & -0.0090+0.0526i & -0.0339+0.0370i & 0.5135      
\end{array}
\right).
\end{equation}
\normalsize

The values of the individual density matrix elements are quite close to those
for the linear inversion with the crucial difference that this
density matrix is positive semi-definite with eigenvalues of 0.9234, 0.0612, 0.0154 and 0.0000.

If we instead perform correct maximum-likelihood estimation
using the log-likelihood function, we obtain
\footnotesize
\begin{equation}
\rho_{\text{ml-ll}}=\left(
\begin{array}{cccc}
0.4478 & -0.0066+0.0460i & -0.0397+0.0053i & 0.4318+0.0093i\\
-0.0066-0.0460i & 0.0073 & 0.0043+0.0079i & -0.0090-0.0526i\\
-0.0397-0.0053i & 0.0043-0.0079i & 0.0314 & -0.0339-0.0369i\\
0.4318-0.0093i & -0.0090+0.0526i & -0.0339+0.0369i & 0.5135
\end{array}
\right)
\end{equation}
\normalsize
which is almost identical to the least-squares
maximum-likelihood estimate, differing only in terms of order
$10^{-5}$.  

From this density matrix we can calculate all of the figures of
merit mentioned in Chapter 1: the concurrence is 0.8585, the purity
0.8567, the von Neumann entropy 0.1911 and the entanglement of
formation 0.8010.
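These numbers can be reproduced from the printed matrix with a few lines of
linear algebra.  The sketch below computes the purity and the Wootters
concurrence, and derives the entanglement of formation from the concurrence;
because the matrix above is rounded to four decimals, the results agree with
the quoted values only to comparable precision.

```python
import numpy as np

rho = np.array([   # the maximum-likelihood estimate above
    [ 0.4478+0.0000j, -0.0066+0.0460j, -0.0397+0.0053j,  0.4318+0.0093j],
    [-0.0066-0.0460j,  0.0073+0.0000j,  0.0043+0.0079j, -0.0090-0.0526j],
    [-0.0397-0.0053j,  0.0043-0.0079j,  0.0314+0.0000j, -0.0339-0.0369j],
    [ 0.4318-0.0093j, -0.0090+0.0526j, -0.0339+0.0369j,  0.5135+0.0000j],
])

purity = np.trace(rho @ rho).real

# Wootters concurrence: sqrt-eigenvalues of rho (sy x sy) rho* (sy x sy)
sy = np.array([[0, -1j], [1j, 0]])
spin_flip = np.kron(sy, sy)
lam = np.linalg.eigvals(rho @ spin_flip @ rho.conj() @ spin_flip)
lam = np.sort(np.sqrt(np.abs(lam.real)))[::-1]   # clip tiny negatives
concurrence = max(0.0, lam[0] - lam[1] - lam[2] - lam[3])

def binary_entropy(x):
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

# the entanglement of formation follows directly from the concurrence
eof = binary_entropy((1 + np.sqrt(1 - concurrence**2)) / 2)
```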

We can also estimate the errors by calculating the covariance
matrix for $\rho_{\text{lin}}$.  The statistical errors on the individual
density matrix elements are 
\footnotesize
\begin{equation}
\Delta\rho=10^{-4}\times \left(
\begin{array}{cccc}
5.372 & 4.232+1.300i & 4.427+0.781i & 6.619+0.510i\\
4.232+1.300i & 8.581 & 6.619+0.510i & 3.158+0.680i\\
4.427+0.781i & 6.619+0.510i & 6.412 & 3.186+0.897i\\
6.619+0.510i & 3.158+0.680i & 3.186+0.897i & 8.298  
\end{array}
\right).
\label{eq:sample_dm}
\end{equation}
\normalsize 
That these errors are smaller than the discrepancy between
$\rho_{\text{lin}}$ and $\rho_{\text{ml-ll}}$ indicates that
systematic rather than statistical errors constitute the dominant error source.
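The covariance-matrix calculation itself is lengthy, but essentially the
same error bars can be obtained by a Monte-Carlo method: resample the raw
counts assuming Poisson statistics, repeat the inversion, and take the
element-wise spread.  A single-qubit sketch (the counts here are invented
for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
paulis = [np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_x (D/A)
          np.array([[0, -1j], [1j, 0]]),                 # sigma_y (L/R)
          np.array([[1, 0], [0, -1]], dtype=complex)]    # sigma_z (H/V)

def rho_linear(counts):
    """Single-qubit linear inversion from counts in the D/A, L/R, H/V bases."""
    counts = np.asarray(counts, dtype=float).reshape(3, 2)
    stokes = (counts[:, 0] - counts[:, 1]) / counts.sum(axis=1)
    rho = 0.5 * np.eye(2, dtype=complex)
    for s, pauli in zip(stokes, paulis):
        rho += 0.5 * s * pauli
    return rho

measured = [480, 520, 530, 470, 950, 50]    # invented D,A,L,R,H,V counts

# resample assuming each count is Poisson-distributed, redo the inversion
samples = np.array([rho_linear(rng.poisson(measured)) for _ in range(2000)])
delta_rho = np.std(samples, axis=0)          # element-wise standard error
```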

\section{Experimental state estimation}
Much of my early thesis work was devoted to building a
computer-controlled system for doing quantum state estimation.  This
formed the basic experimental system used throughout this thesis.
This section will explain the functioning of the system, describe
measurements taken on it and discuss possible improvements.

\subsection{Spontaneous parametric downconversion}
The discovery of Bell's inequalities\cite{Bell1964} in the mid-sixties triggered a
search for sources of entangled particles that could be used to test
them.  Realistic proposals for testing Bell's inequalities with photons were published in
1969\cite{Clauser1969} and eventually realized in 
atomic cascade systems\cite{Freedman1972,Aspect1981}.  Compared with
modern sources, these sources were limited in their rates by the fact that
emission was isotropic and hence weak in any particular measurement
direction, and by their narrow resonances and hence limited
re-excitation rate.   Additionally, these cascade sources were inherently
non-degenerate in frequency, limiting the sorts of two-photon
interference effects that could be implemented with them.  The discovery of spontaneous parametric
downconversion (SPDC)\cite{Burnham1970} in $\chi(2)$ non-linear materials
delivered solutions to both of these problems while providing a much more
convenient and efficient source of paired photons.  SPDC was initially
used \cite{Hong1987,Steinberg1992} for two-photon interference experiments, but
a much wider range of experimental possibilities was opened up
by the
development of polarization-entangled SPDC sources\cite{Kwiat1995,Kwiat1999}.  These
sources allowed very large violations of Bell's inequalities\cite{Kwiat1999},
including violations over large distances\cite{Weihs1998}, closing the communication
loophole.  They allowed entanglement to be studied in much
greater depth than it had been up to that point\cite{White1999}.  They also led very
quickly to four-photon\cite{Pan2001}, and later six-photon\cite{Zhao2004}, sources which were used to demonstrate
teleportation\cite{Pan2001}, GHZ-type entanglement\cite{Pan2000}, cluster state quantum computing\cite{Walt2005},
quantum logic gates\cite{OBrien2004}, quantum algorithms\cite{Lanyon2007} and a wide host of other
technologies.  While there have been exciting developments in
alternative sources of photons such as quantum dot sources\cite{Santori2002}, cavity QED sources\cite{McKeever2003}, and
sources making use of $\chi(3)$\cite{Li2005} or spontaneous Raman
scattering\cite{Chou2004}, $\chi(2)$ SPDC remains the standard tool
for studying entanglement, quantum information and linear optical
quantum computing systems.

Spontaneous parametric downconversion can be thought of as the
time-reversal of sum-frequency generation.  Sum-frequency generation
is a classical effect that occurs when the
mechanical response of the charge in a dielectric medium is non-linear
in the applied field.  This results in a polarization field in the
material with frequency components that were not present in the applied field.
Since the non-linear response of all ordinary materials is extremely
weak, this response is very well approximated by the first
non-linear term in the Taylor series expansion of the material polarization.  In materials that lack
inversion symmetry, such as many crystals, this is a term quadratic in
the applied field.  Such materials are called $\chi(2)$ materials.  In centro-symmetric materials like glasses
this is a term cubic in the applied field and the materials are called
$\chi(3)$ materials.  In $\chi(2)$ materials, sum-frequency
generation involves two input
fields with frequencies $f_1$ and $f_2$
generating an output field with a frequency of $f_3=f_1+f_2$.
In the special case that $f_1=f_2$, sum-frequency generation is called
second-harmonic generation.

Time-reversal invariance implies that
it ought to be possible to put a frequency $f_3$ into a $\chi(2)$
non-linear material and see two frequencies $f_1$ and $f_2$ come out
with $f_1+f_2=f_3$.  Indeed, Maxwell's equations predict that if two seed
beams with frequencies $f_1$ and $f_2$ are present at the input to the
medium then they will both be amplified, an effect known as optical
parametric amplification.  When quantum theory is
applied to this situation, the main
result is that light at $f_1$ and $f_2$ is generated even in the
absence of a seed
field.  One interpretation of this is that the seed is provided by
quantum vacuum fluctuations of the electromagnetic field.  Thus in
quantum theory a $\chi(2)$ material illuminated with light at a
frequency $f_3$ will generate a polarization field at \emph{every} pair of frequencies
$f_1$ and $f_2$ such that $f_1+f_2=f_3$.  The two SPDC fields are, for
historical reasons, called signal and idler fields.  

Because the excitations of the field are quantized, the non-linear interaction
must destroy a whole photon from the pump and create whole photons in
the signal and idler beams when the interaction takes place.
This effect, together with the Planck relation $E=hf$, means that the 
relationship $f_i+f_s=f_p$ is simply an expression of
conservation of energy at the single-photon level, 
\begin{equation}
 E_s+E_i=E_p.
\label{eq:energy_conservation}
\end{equation}
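In terms of vacuum wavelengths this reads
$1/\lambda_p = 1/\lambda_s + 1/\lambda_i$, which fixes the idler wavelength
once the pump and signal are chosen.  A one-line helper illustrates the
bookkeeping (wavelengths in nm):

```python
def idler_wavelength(pump_nm, signal_nm):
    """Solve 1/pump = 1/signal + 1/idler (energy conservation) for the idler."""
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)

# degenerate downconversion of a 405 nm pump gives photon pairs at 810 nm
idler_wavelength(405.0, 810.0)   # -> 810.0
```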

So far the discussion of polarization fields has considered their
generation, but not their propagation.  In order for a polarization
field to propagate in the crystal, the polarization fields generated by
the pump at different locations in the crystal must interfere
constructively in a given direction.  Since the polarization created
locally by the pump will be in phase with the pump, the polarization
field will only propagate if it remains in phase with pump during
propagation.  This amounts to a requirement that the wave-vectors
of the signal and idler sum to the wave-vector of the pump.
Interestingly, this puts no restrictions on the individual
wave-vectors of the signal and idler, only their sum.  The requirement
can be expressed as 
\begin{align}
\vec{k_p}&=\vec{k_s}+\vec{k_i}\notag\\
\hbar \vec{k_p}&=\hbar \vec{k_s}+\hbar \vec{k_i}\notag\\
\vec{p_p}&=\vec{p_s}+\vec{p_i},
\label{eq:momentum_conservation}
\end{align}
where we have used de Broglie's relationship to show that this
condition on the wave-vectors amounts to nothing more than momentum conservation for the
three photons.  Satisfying this condition is known as \emph{phase-matching}.

Most dielectric materials have normal dispersion, meaning that
frequency increases sub-linearly with wave number $k_0=|\vec{k_0}|$.
This makes equation \ref{eq:momentum_conservation} impossible to
satisfy, since it means that $k_p$ will always be greater than
$k_s+k_i$.  A clever
solution to this problem is to employ birefringent materials\cite{Boyd1965} which
have the property that different polarizations have
different phase velocities. If the pump is polarized along the
direction with a relatively fast phase velocity, then there will
generally be some emission directions for which the signal and idler will be
phase-matched at some frequencies.   Not all $\chi(2)$
materials are birefringent\footnote{The most notorious case of this is
  GaAs which has a very large $\chi(2)$ non-linear susceptibility, but has so
  far proved impossible to phase-match, despite a considerable amount
  of effort expended on the problem.}, nor do all $\chi(2)$ materials that are
birefringent have non-zero non-linear susceptibility for the
combinations of polarizations that can be phase-matched.  There are,
however, many materials for which birefringent phase-matching is
possible, some of the most popular being beta-phase barium
borate ($\beta$-BBO), lithium triborate (LBO), potassium titanyl
phosphate (KTP), lithium niobate (LiNbO$_3$) and bismuth triborate (BiBO). 
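To make the phase-matching condition concrete, the collinear degenerate
type-I angle for $\beta$-BBO can be estimated numerically.  The sketch below
uses approximate literature Sellmeier coefficients for BBO (treat the
numerical values as indicative rather than authoritative) and bisects for
the pump-axis angle at which the extraordinary pump index equals the
ordinary index at the doubled wavelength:

```python
import math

# approximate Sellmeier equations for beta-BBO (wavelength in micrometres)
def n_o(lam):
    return math.sqrt(2.7405 + 0.0184 / (lam**2 - 0.0179) - 0.0155 * lam**2)

def n_e(lam):
    return math.sqrt(2.3730 + 0.0128 / (lam**2 - 0.0156) - 0.0044 * lam**2)

def n_theta(theta, lam):
    """Index seen by an extraordinary ray at angle theta to the optical axis."""
    inv_n2 = math.cos(theta)**2 / n_o(lam)**2 + math.sin(theta)**2 / n_e(lam)**2
    return 1.0 / math.sqrt(inv_n2)

def type1_collinear_angle_deg(pump_um):
    """Bisect for n_theta(theta, pump) = n_o(2 * pump)."""
    target = n_o(2.0 * pump_um)
    lo, hi = 0.0, math.pi / 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if n_theta(mid, pump_um) > target:   # pump index still too high
            lo = mid
        else:
            hi = mid
    return math.degrees(0.5 * (lo + hi))

angle = type1_collinear_angle_deg(0.405)   # close to 29 degrees
```

This collinear estimate lands just under $29^\circ$; the $29.2^\circ$ cut
used for the crystals described later in this chapter corresponds to the
slightly non-collinear $3^\circ$ opening-angle geometry.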

In most birefringently phase-matched materials there are two ways to
select polarizations of the pump, signal and idler to achieve
phase-matching.  In type-I phase-matching, the signal and idler have
the same polarization and both are orthogonal to the pump
polarization.  In type-II phase-matching the signal and idler have
opposite polarization, and the pump is parallel to one of them and
orthogonal to the other.  Most birefringent materials allow either
type of phase-matching to be used, although a few materials allow
type-I but not type-II due to crystal symmetry (i.e. no non-linear polarization
field develops in the directions required to give the signal
and idler polarizations).  When both types of phase-matching are
allowed, type-I tends to have a slightly larger component of the
non-linear susceptibility tensor. 

The two types of phase-matching lead to very different SPDC emission
patterns due to the behavior of light in birefringent materials.
Type-I emission is the more straightforward of the two.  Conservation
of momentum dictates that if the signal and idler are degenerate in
wavelength then they must make the same angle with the pump $k$-vector
in order to conserve momentum.  The emission pattern is therefore in
the shape of a cone with the angle of emission being determined by
simultaneously solving equation \ref{eq:energy_conservation} and
equation \ref{eq:momentum_conservation} along with the birefringent
dispersion
relations $n_e(\lambda)$ and $n_o(\lambda)$.  In type-II
phase-matching in a uniaxial crystal one downconverted photon will be
extraordinarily polarized while the other will be ordinarily
polarized.  Because the two photons see different indices of
refraction, even when they are degenerate in frequency, the emission
cones are not coincident.  One is centered above and the other below
the pump direction, where above and below refer to the direction of
the optical axis. By adjusting the angle between the crystal axis
and the pump, the cones can
be made to overlap\cite{Kwiat1995} or can be reduced to point-like regions\cite{Take2001}.
\ignore{
Figure \ref{phasematching_sim} shows simulations of the far-field
intensity of degenerate emission from
a $1$ mm BBO crystal pumped at 405 nm for both type-I and type-II
phasematching. 
} 
 
\subsection{Polarization-entangled SPDC sources}
While early experiments using SPDC sources examined the
time-frequency and position-momentum entanglement\cite{Kwiat1993,Rarity1990},
the development of polarization-entangled sources revolutionized
SPDC's role in quantum information and quantum computing.  The reason
is that polarization provides an easily manipulated, easily measured
qubit and SPDC sources can provide entangled two-qubit states
of unparalleled quality\cite{Groblacher2007}.  

Both kinds of SPDC phasematching can be used to create entangled
photons.  Type-II phasematching provides entangled photons without any
additional effort so long as the phase-matching is chosen so as to
make the ordinary and extraordinarily polarized cones intersect at two
points.  If light is collected from the two intersection points there
is an amplitude for point 1 to contain the extraordinarily polarized
photon and point 2 to contain the ordinarily polarized photon and
another amplitude for point 1 to contain the ordinary photon and point
2 to contain the extraordinary photon.  The relative phase between
these two amplitudes will depend on the details of the phase matching
and the length of the crystal, but can be easily adjusted by applying
a birefringent phase shift to one of the two photons.  The result is
an entangled photon state of a very high quality\cite{Kwiat1995}.

Type-I SPDC is not naturally polarization-entangled since both the
SPDC photons have the same polarization.  A clever idea due to Kwiat\cite{Kwiat1999},
though, was to sandwich two Type-I crystals together with their axes
at $90^\circ$ to each other.  A pump polarized at $45^\circ$ to the
two crystal axes will excite non-linear polarization fields in both crystals
with a definite phase relationship between them.  If the emission
rates are low then downconversion will only occur in one of the
crystals at a time, but there will be a definite phase between the amplitudes
for downconversion in each crystal.  If the output of the two crystals
overlaps then at any two points on the emission cone there is an equal
amplitude for the photons to have come from either of the two crystals
and hence for both to be horizontally polarized or both to be
vertically polarized.

This source has the advantageous property that \emph{every} pair
of photons that it produces is entangled and, what is more, it allows the
degree of entanglement to be controlled losslessly by changing the polarization
of the pump.  This makes the source especially useful for experiments
that examine non-maximally entangled states\cite{White1999}
or where the degree of entanglement needs to be controlled.  
The experiments described in chapters 4 and 5 make use of a
type-I entangled source while those described in chapter 3 make use of a
type-II unentangled source.  

\subsection{Real-world sources}
While SPDC sources are capable of supplying very high-quality
entangled states, they are subject to imperfections that can
limit entanglement or otherwise produce less-than-perfect
photons.  This section will explain the effects that must be
considered in order to correct source imperfections and present
some results from our attempts to improve our sources.  This is an
ongoing process, with new techniques for
improving sources being a very active area of current
research\cite{Altepeter2005_1,Mosley2008}.  The sources we currently use are
still quite far from optimal, and improving them will be an
ongoing project for the next few years.

Since our experimental work made use of two specific sources for
two very different applications, we will first discuss each
source and its desired properties and then consider the
experimental imperfections that will affect these properties.  
Since 2005 I have been working exclusively on the Type-I source, so
most of the discussion will center on it, with less attention paid to
the Type-II source.

\subsection{Type-I sandwich source}
The type-I source used in chapters 4 and 5 consisted of two $2.5$
mm long pieces of $\beta$-BBO glued together by the crystal manufacturer
Crystran.  The crystals were rotated
$90^\circ$ to each other and the optical axes were tilted
$29.2^\circ$ away from the surface normal.  This optical axis
angle was chosen so that
the crystals would be phase-matched for degenerate type-I SPDC
at $810$ nm at a $3^\circ$ opening angle with a 405 nm pump at normal
incidence.  In practice it
was found that one of the crystals emitted at $810$ nm at
normal incidence while the other had to be tilted approximately $0.8^\circ$
away from normal to achieve phase-matching for the same opening angle.  This is likely
due to a manufacturing error, but the cause of the discrepancy was
never fully investigated.

The source was pumped with a
28 mW output Coherent Vioflame laser containing a gallium-nitride laser diode
chip selected to have a $405\pm0.3$ nm center wavelength.  The
laser spectrum as measured by an Ocean Optics HR2000 spectrometer is
shown in figure \ref{fig:405nm_laser_spectrum}.  The spectral
width is $0.5$ nm with a corresponding coherence
time of $1.1$ ps or a coherence length of $330$ $\mu$m.
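These figures follow from the standard estimate
$l_c \approx \lambda^2/\Delta\lambda$ and $\tau_c = l_c/c$ (order-unity
prefactors depend on the assumed lineshape); a quick numerical check:

```python
lam = 405e-9      # centre wavelength (m)
dlam = 0.5e-9     # FWHM bandwidth (m)
c = 2.998e8       # speed of light (m/s)

l_c = lam**2 / dlam   # coherence length: about 330 micrometres
t_c = l_c / c         # coherence time: about 1.1 ps
```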

\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[width=\columnwidth]{Figures/vioflame_spectrum.eps}}
  }
  \caption{Measured spectrum of the 405 nm pump laser.  The centre wavelength
    is found to be 404.8 nm with a full-width at half-maximum bandwidth of 0.5 nm.}
  \label{fig:405nm_laser_spectrum}
  \end{figure}


Before being sent into the crystal, the laser was polarized with
a Glan-Thomson calcite polarizer (Thorlabs GTH10M) and then
rotated to the desired polarization with a $405$ nm true
zero-order quartz half-waveplate from FocTek.  

SPDC emission from the crystal is very broadband.  The bandwidth was
estimated to be approximately $60$ nm based on the observed reduction
in counts when $13$ nm interference filters were used to narrow the
spectrum.  In principle, the spectrum could be measured accurately
using the Hong-Ou-Mandel effect\cite{Hong1987}, but this measurement
has not been undertaken.  

\subsubsection{Longitudinal walkoff}
In order for the SPDC
from the two crystals to be coherent, the emission from the crystals
must be spatially and temporally indistinguishable.  Naively one might
think that this would involve matching the group delays in the
two crystals to better than an SPDC coherence length.
Surprisingly this is not the case -- even if emission from one
crystal is significantly advanced with respect to the other, the
emission amplitudes for the two crystals will be coherent, so
long as the delay between them is less than the coherence length
of the pump\cite{Kwiat1999}.  The reason for this is that with a
continuous wave (CW) pump the SPDC emission time is completely
random, so it makes no sense to talk about early or late emission
from the two
crystals.  Even if emission from the two crystals experiences wildly
different group delay, this delay generates no information about which
crystal produced the downconversion and so coherence between the
two possibilities is conserved.  However if the pump has a finite
coherence time then emission amplitudes
at delays larger than that coherence time will have random
relative phases and will not be coherent.  In our crystal
geometry, a pair of SPDC photons downconverting in
the middle of the first crystal will experience a total optical group
delay of 8.272 mm before exiting whereas a pair created in the middle
of the second crystal will experience an optical group delay of 8.588 mm.
This includes the group delay that the pump experiences before
the downconversion happens.  Thus there is a difference in the
group delays of 318 $\mu$m.  This can be compared with the 330
$\mu$m coherence length of the pump.  

Only the extraordinarily polarized component of the pump will
create SPDC in each of the crystals.  It is therefore possible
to precompensate for this decoherence in the SPDC by delaying
one component of the pump polarization relative to the other.
This can be done simply by inserting a piece of birefringent
material with its axis aligned with the SPDC crystal axes.  In
our experiment a piece of $\alpha$-phase BBO was used for this
purpose.  One
can calculate the expected concurrence with and without the
compensating crystal.  With the compensator the expected
concurrence is 0.963 and without it the concurrence is only 0.274.  The
expected concurrence with the compensator is significantly higher than
the maximum measured concurrence of 0.89, indicating that there are
other sources of decoherence besides longitudinal walkoff.

\subsubsection{Transverse walkoff}
In addition to longitudinal walkoff, the crystals exhibit
transverse walkoff.  While longitudinal walkoff results from a
frequency-dependent phase, transverse walkoff results from a
direction-dependent phase.  In uniaxial birefringent
materials, extraordinary rays experience an index of refraction
that depends on the angle the ray makes with the optical axis.
This dependence of index on angle results in a dependence of the
phase of the extraordinary beam on propagation direction.  The
linear component of this phase dependence causes the center of the
beam to `walk off' in the
transverse direction.  This effect is easy to see in thick pieces
of quartz or calcite.  An unpolarized light source viewed
through a thick piece of calcite will appear in double because
the extraordinarily polarized part of the image will have walked
off relative to the ordinarily polarized part.  It is also the
principle used in making calcite polarizers and polarizing
beamsplitters.
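The magnitude of the effect can be estimated from the standard uniaxial
walkoff formula,
$\tan\rho = \frac{n^2(\theta)}{2}\left|\frac{1}{n_e^2}-\frac{1}{n_o^2}\right|\sin 2\theta$.
The sketch below evaluates it for $\beta$-BBO at the downconverted
wavelength, using approximate literature Sellmeier coefficients (indicative
values only):

```python
import math

# approximate Sellmeier equations for beta-BBO (wavelength in micrometres)
def n_o(lam):
    return math.sqrt(2.7405 + 0.0184 / (lam**2 - 0.0179) - 0.0155 * lam**2)

def n_e(lam):
    return math.sqrt(2.3730 + 0.0128 / (lam**2 - 0.0156) - 0.0044 * lam**2)

def walkoff_deg(theta_deg, lam):
    """Walkoff angle of an extraordinary ray at theta to the optical axis."""
    th = math.radians(theta_deg)
    inv_n2 = math.cos(th)**2 / n_o(lam)**2 + math.sin(th)**2 / n_e(lam)**2
    tan_rho = (0.5 / inv_n2) * abs(1 / n_e(lam)**2 - 1 / n_o(lam)**2) * math.sin(2 * th)
    return math.degrees(math.atan(tan_rho))

rho = walkoff_deg(29.2, 0.81)   # a few degrees for an 810 nm extraordinary ray
```

For a crystal cut near $29^\circ$ this gives a walkoff of a few degrees,
which over a $2.5$ mm crystal corresponds to a transverse displacement of
order $150$ $\mu$m.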

In type-I sandwich SPDC the effect causes the SPDC produced in
the first crystal to be displaced relative to the SPDC produced
in the second crystal since the SPDC from the first crystal will
be extraordinarily polarized in the second crystal.  Moreover,
the non-local phase between the extraordinary and ordinary
polarizations will be spatially-dependent in the far field.  This
means that for each iris position, the phase $\phi$ in the measured
state $\frac{1}{\sqrt{2}}\left(\ket{HH}+e^{i\phi}\ket{VV}\right)$ will
be different.  This effect can be seen in figure
\ref{fig:phase_versus_iris_position} where the measured non-local
phase is plotted against the position of one of the detector irises.
When SPDC is collected over a finite collection
aperture, this spatially-dependent birefringent phase results
in reduced entanglement and reduced purity.  The effect can be
overcome by reducing the collection aperture, but this also
reduces the total number of counts.  Figure
\ref{fig:entanglement_versus_aperture} shows a plot of the
concurrence of an entangled state versus the collection aperture
size and compares it to the concurrence expected from the phase
dependence on iris position.  The measured concurrence is considerably
lower than the expected concurrence, so there is clearly some other
decoherence mechanism operating.

\begin{figure}
\resizebox{\columnwidth}{!}{\includegraphics{Figures/phase_versus_iris_position.eps}}
  \caption{Plot of the measured non-local phase $\phi$ in
    $\frac{1}{\sqrt{2}}\left(\ket{HH}+e^{i\phi}\ket{VV}\right)$
    against iris position.  The linear dependence is expected due to
    the transverse walkoff experienced by downconversion from the first
    crystal in the second crystal.}
\label{fig:phase_versus_iris_position}
\end{figure}

\begin{figure}
\resizebox{\columnwidth}{!}{\includegraphics{Figures/concurrence_vs_iris_size.eps}}
  \caption{Plot of the measured concurrence against the diameter of
    the two collection apertures.  The theory is calculated based on
    the measured phase dependence from figure
    \ref{fig:phase_versus_iris_position}.  The discrepancy indicates
    that some other iris diameter-independent decoherence mechanism is
    also playing a role.}
\label{fig:entanglement_versus_aperture}
\end{figure}

Another option that is currently being explored as an upgrade
to the setup is to collect into single-mode fiber instead of
through an aperture.  A single-mode fiber would act as a perfect
spatial filter, eliminating any spatial variation in phase.  This
option will also be lossy, although perhaps less so than using
very small apertures without focusing.

A better solution, proposed by Altepeter, Jeffrey and
Kwiat\cite{Altepeter2005} was to use additional crystals after the
SPDC crystal to apply a direction-dependent phase shift
that exactly cancels the one obtained from the SPDC crystals.
These extra crystals would in turn create an additional
longitudinal walkoff that would also have to be compensated with
yet another set of crystals.  We attempted this for our setup using
calcite crystals cut with axes at $45^\circ$, but the addition of the
crystals was not found
to increase the concurrence.  The reason why this did not
work is unclear.  One orientation of the crystals
was found to reduce the concurrence, but no orientation was
found that could increase it.  Two conclusions are possible: either the transverse walkoff is
not the limiting factor in the concurrence or the calculated
crystal lengths were incorrect.  More investigation will be
needed to determine which of these possibilities is the right
one.
\subsubsection{Alignment of two crystals}
To align the two SPDC crystals the following procedure was used:
\begin{enumerate}
\item Align the crystal axes with a polarization reference by
  putting the crystal between crossed polarizers and minimizing
  the transmission.   
\item Polarize the pump along the axis of one of the crystals so
  that only one crystal generates SPDC.  Align the pump laser to retroreflect off of the crystal
  surface.
\item Place an iris a few tens of centimeters away from the
  crystal at a 3 degree angle to the pump.
\item Align a visible alignment laser so as to pass through both the
  iris and the crystal.
\item Direct the alignment beam to the detectors.
\item Align the lenses on the detectors by looking through the
  lens and trying to produce a clear image of the detector element.
\item Maximize the photon counts due to the alignment beam.
\item Turn off the alignment beam.  If everything was done
  carefully there should be some counts from SPDC.  Adjust the
  lenses and mirror so as to maximize these counts.  Once the
  rates at the individual detectors are optimized there should
  be some coincident detection events and the optimization
  procedure can be repeated to maximize these.
\item Insert spectral filters before the detectors.  Hopefully
  there are still some coincidence counts even with the filters
  in place.  Adjust the crystal angles in the plane of the pump
  polarization and the pump direction to maximize the counts.
\item Rotate the pump polarization by $90^\circ$.  Tilt the other
  crystal axis direction so as to maximize the coincidences seen at
  the detectors.  If the pump polarization is now rotated to $45^\circ$, both crystals
  will emit and the two-photon state will be entangled.
\end{enumerate}

In recent discussions with other groups working on
SPDC\cite{Barbieri2008} it has
become clear to me that crystal and detector alignment is
significantly easier with single-mode
fibers.  With single-mode fibers, an alignment laser can be sent
backwards through the fibers destined for the detectors and the
focusing optics adjusted so as to obtain an intersection of the
beam waists for the pump and the two alignment lasers.  The
waist sizes can also be adjusted so as to be the same or so as
to have a smaller waist for the SPDC than for the alignment beams
which is predicted to be the optimal configuration\cite{Dragan2004}.  The
spatial frequency of the standing-wave interference pattern created by the two alignment
beams gives an accurate measure of the angle between them,
allowing a very precise definition of the geometry, and the overlap of
the beams can be made very accurate by having all of them pass through
the same pinhole.  When the
crystal is inserted into this setup, coincidences should be
immediately observable.  The next SPDC experiments undertaken in
the Steinberg group should definitely try to make use of this
technique.  In order to make this work, a pump laser that is either
fiber-coupled or circularized so as to make coupling into fiber more
efficient would be a significant asset.  The Coherent vioflame lasers
have highly elliptical emission and could not be coupled into
single-mode fiber with
more than 20\% efficiency, even when anamorphic prisms were used to
try to correct the beam shape. 

\subsection{Source output}
The type-I source has been in regular use since it was built in
2004.  A typical reconstructed density matrix is the one given in the
tomography example above, with statistical errors given in equation
\ref{eq:sample_dm}, while a typical CHSH
correlation measurement\cite{Clauser1969} is shown in Figure
\ref{fig:sample_bell_measurements}.   The $90.1\pm0.5\%$ and $89.7\pm0.5\%$ visibilities of these
fringes are a typical signature of Bell inequality
violation\cite{Aspect1981}, and lead to a value of the CHSH\cite{Clauser1969} S
function of $2.61\pm0.02$, clearly violating the local realistic maximum
value of 2.  
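For reference, the CHSH combination can be checked against the ideal case.
For a perfect $\frac{1}{\sqrt{2}}\left(\ket{HH}+\ket{VV}\right)$ state the
polarization correlation between linear analyzers at angles $a$ and $b$ is
$E(a,b)=\cos 2(a-b)$, and the standard settings give $S=2\sqrt{2}$ (a
sketch; imperfect fringe visibility scales these correlations down):

```python
import math

def chsh_S(E_ab, E_ab2, E_a2b, E_a2b2):
    """CHSH combination S = |E(a,b) - E(a,b') + E(a',b) + E(a',b')|."""
    return abs(E_ab - E_ab2 + E_a2b + E_a2b2)

def E_ideal(a, b):
    """Correlation for an ideal (|HH> + |VV>)/sqrt(2) state."""
    return math.cos(2 * (a - b))

deg = math.pi / 180
a, a2, b, b2 = 0.0, 45 * deg, 22.5 * deg, 67.5 * deg
S = chsh_S(E_ideal(a, b), E_ideal(a, b2), E_ideal(a2, b), E_ideal(a2, b2))
# S = 2*sqrt(2) ~ 2.83, the quantum maximum
```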

Although our source has a reasonable degree of
entanglement suitable for our quantum information experiments,
it still falls short of the results obtained by James et al.\cite{James2001}
using 500 $\mu$m long crystals and a narrow-band Ar-ion pump with no
compensation for either longitudinal or transverse walkoff.
While they achieved a concurrence of $0.963\pm0.016$, our
concurrence was $0.86\pm0.02$.  This number is comparable to the
degree of entanglement seen by at least one other source using long
crystals and an unstabilized diode laser pump\cite{Resch2008}.  

While the calculations of this chapter make clear that the current setup
still shows less entanglement than ought theoretically to be
achievable, the results of James et al. demonstrate that by
eliminating longitudinal and transverse walkoff through a long
pump coherence length and short crystals it is possible to
achieve nearly perfectly entangled states.   Modifying our setup
to collect into single-mode fiber with a narrow-band grating-stabilized pump
ought to make it possible to match or exceed the quality of entanglement that
they achieved.    
\begin{figure}
\psfrag{aaaaaaaa1}[Bl][Bl][0.5][0]{$\alpha=0^\circ$}
\psfrag{aaaaaaaa2}[Bl][Bl][0.5][0]{$\alpha=90^\circ$}
\psfrag{aaaaaaaa3}[Bl][Bl][0.5][0]{$\alpha=45^\circ$}
\psfrag{aaaaaaaa4}[Bl][Bl][0.5][0]{$\alpha=-45^\circ$}
\psfrag{Anglekok}[Bl][Bl][0.5][0]{Angle($^\circ$)}
\includegraphics{Figures/chsh.eps}
\caption{CHSH measurements.  Half-waveplates are placed in front of
  polarizers at detectors A and B.  The different curves are for
  waveplate A at angles $0^\circ$, $22.5^\circ$, $45^\circ$ and $67.5^\circ$
  (effectively creating a polarizer at the angle $\alpha$ in
  the legend).  The $x$-axis gives the angle of waveplate B.  The values of the curves at angles $11.25^\circ$
  and $33.75^\circ$ can be used to calculate the CHSH S-function.  The high
  visibility of $90\%$ for $\alpha=\pm 45^\circ$ is
  indicative of a Bell's inequality violation.}
\label{fig:sample_bell_measurements}
\end{figure}


\subsection{Type-II beam-like emission source}
The purpose of the type-II source used in the experiments in
Chapter 3 and in several of the
other experiments in our group\cite{Adamson2007,Shalm2008,Mitchell2004}
was to produce
photon pairs, not entangled in polarization, that would be coupled into
single-mode optical fiber.  The photons were produced by SPDC of a
doubled 50 fs pump pulse from a Ti:Sapph oscillator running at an 82 MHz repetition rate \footnote{Some modifications to the
  laser undertaken in 2005 increased the pump pulse length to 100 fs
  and reduced the repetition rate to 40 MHz.  This laser was then
  replaced with a Coherent Mira.}.  For reasons to be
explained in more detail in
Chapter 3, it was desirable that the SPDC photons be
temporally indistinguishable from the pulses produced directly
from the Ti:Sapph.  

In type-II phase-matching, changing the angle between the
crystal axis and the pump will change the opening angle of the
SPDC cones at a given wavelength.  When that angle is made small
enough, the cones collapse to spots, a
phenomenon first documented by Takeuchi\cite{Take2001}.  At the
time the pulsed experiments in our lab were being set up it was
believed that this sort of emission would prove optimal for
coupling into single-mode fiber. Indeed better coupling was obtained
for this geometry than for the Type-II collinear geometry, although it
was difficult to determine whether coupling had been truly optimized
in either situation\cite{Kurtsiefer2001}.  Recent theoretical
work has provided some better tools for analyzing such
questions\cite{Ling2008_2,URen2007}.

\section{Photon detection}
The photodetectors used in all the experiments presented in this
thesis were Perkin Elmer Single Photon Counting Modules (SPCMs) based on
an actively-quenched silicon avalanche photodiode (APD) operated
in `Geiger mode'\cite{Cova1996}.  Avalanche photodiodes are photodiodes in
which electron-hole pairs are accelerated by a large applied
reverse-bias field which can create
additional electron-hole pairs by impact ionization.  This
allows a signal consisting of many electron-hole pairs to be
generated by a single photon.  An APD in Geiger mode has a
reverse-bias field so large that it exceeds the breakdown
field of silicon.  In such a device a single electron-hole pair is
sufficient to trigger a
macroscopic current of around a milliamp which can then be
picked up by standard discriminator electronics, triggering a pulse.  A photon
incident on a Perkin-Elmer detector will produce a 25 ns long 4V signal capable of
driving a 50$\Omega$ transmission line.  While the pulse is 25
ns long, the timing jitter between the arrival of the photon
and the beginning of the pulse was measured to be between $300$ and
$600$ ps, depending on the detector.

The spectral response of the detectors taken from the detector
datasheet\cite{SPCMDatasheet} is reproduced in figure
\ref{fig:detector_spectral_response}. 
\begin{figure}
\resizebox{\columnwidth}{!}{\includegraphics{Figures/detector_spectral_response.eps}}
  \caption{Spectral response curve for the Perkin-Elmer
    single-photon counting module.}
\label{fig:detector_spectral_response}
\end{figure}
At our operating wavelength of $810$ nm we expect a detection
efficiency of roughly $60\%$.

The avalanche nature of the detectors makes them insensitive to
the strength of the incident pulse.  The detectors are not
number-resolving and so can only
distinguish no photons from some photons. 
Occasionally the detectors will fire even in the absence of an
incident photon.  Such events, called dark counts, are due
to crystal and surface defects in the semiconductor.  For the
detectors used in these experiments dark counts were between a
few dozen and a few hundred counts per second whereas (unpaired) SPDC
counts were several tens of thousands per second.  For most practical
purposes the detector dark counts could be safely ignored.

The active detector area of the SPCM is a square element 100
$\mu$m on a side.  For the
fiber-coupled detectors used in the
experiments in Chapter 3, a fiber coupler was mounted to the
detector and simply butt-coupled to this element.  For the
free-space experiments in Chapters 4 and 5, the incoming SPDC
light was coupled to the detectors through a 5 cm focal-length
lens mounted on a three-axis translation stage.  A magnified
image of the detector element was clearly visible through the
lens, a fact that was used in the alignment of the system.

\subsection{Coincidence detection}
The very tight sub-picosecond timing correlations\cite{Hong1987} between two photons in an
SPDC pair allow the pairs to be very accurately filtered out
from background light and detector dark counts by coincidence
detection.  For this to work, the pulses from the two detectors
must be fed to an electronic circuit that fires when two pulses
arrive at the same time.  Ideally such a circuit should be able
to resolve coincident detections to within the detector jitter
time of $\sim 500$ ps.  A coincidence circuit is characterized by a
coincidence window, the maximum time difference that can occur between two
pulses before they are filtered out.  For any coincidence window
larger than zero, background light will cause some level of
accidental coincidences when two background photons happen to
trigger the detectors within a time less than the coincidence
window.  Assuming the background rates at the two detectors are
uncorrelated, the rate of accidental coincidences will
be 
\begin{equation}
R_C=R_1 R_2 \tau_c
\end{equation}
where $R_C$ is the rate of accidental coincidences, $R_1$ and $R_2$
are the singles rates at the two detectors and $\tau_c$ is
the coincidence window.  The main source of accidental
coincidences is between unpaired SPDC events.  For instance, in
the type-I system, the conditional
probability of detecting a photon at one detector given that one
was detected at the other detector ranged from $0.1\%$ to $12\%$,
depending on the iris sizes.
This meant that the rate of unpaired SPDC photons was $10$ to $1000$
times higher than the rate of coincidences.  

When I first arrived at the lab, coincidence detection
was done by a crude discrete TTL logic gate circuit that simply
applied an AND operation to the two pulses.  The resulting
coincidence window was $50$ ns, so that with typical singles
rates at the detectors of $30,000$ per second the accidental
coincidence rate was $45$ per second.  This represented a significant
background for tomography, two-photon interference, cryptography and
entanglement measurements, given that the true coincidence rate was on the
order of $120$ coincidences per second.  

In 2006 I designed and built a more sophisticated coincidence detection
circuit that used a high-pass filter to isolate the rising edge
of each signal and fast PECL logic to perform the AND operation
between them.  It also expanded capacity to measure coincidences
to four channels so that light could be collected from both
output ports of a polarizing beamsplitter in the analyzer.  With
this circuit I was able to reduce the coincidence window to a
minimum of 1.2 ns, although working with such a short
coincidence window reduced the overall coincidence rate.  With a
3 ns window the coincidence rate was unreduced, while the accidental
coincidence rate was brought down to $\sim 3$ per second.  
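As an illustrative check of the arithmetic (the rates are those quoted above; this snippet is not part of any apparatus software), the accidental-rate formula $R_C=R_1 R_2 \tau_c$ can be evaluated for the two coincidence windows:

```python
# Illustrative check of the accidental-rate formula R_C = R1 * R2 * tau_c
# using the singles rates quoted in the text.
def accidental_rate(r1, r2, tau_c):
    """Accidental coincidence rate for uncorrelated singles rates
    r1, r2 (counts/s) and coincidence window tau_c (s)."""
    return r1 * r2 * tau_c

# 30,000 singles/s at each detector with the original 50 ns window:
print(accidental_rate(30e3, 30e3, 50e-9))  # ~45 accidentals per second

# The same singles rates with a 3 ns window:
print(accidental_rate(30e3, 30e3, 3e-9))   # ~2.7 accidentals per second
```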

The schematic and printed circuit board layout of the circuit can be
found in the appendices.  A high-pass RC filter
differentiates the input signal from each of two detectors and
the resulting signal is discriminated by a high-speed
comparator.  The comparators drive the inputs of a PECL-logic
AND gate.  When the AND gate triggers, an RS latch, consisting of
two PECL NAND gates, is set.  An 80 MHz FPGA continuously polls
these latches and resets them once they are read.  The FPGA runs
counters for each of the signals from the latches and the inputs
and sends the counter values to a Delcom 802600 microcontroller
which passes them on to the computer over USB.  
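The latch-and-poll counting scheme can be sketched as a toy software model (illustrative only: the real logic is PECL hardware polled by an FPGA, and the class and variable names here are hypothetical):

```python
# Toy software model of the latch-and-poll counting scheme (illustrative
# only; the real logic is PECL hardware polled by an FPGA).
class Latch:
    """RS latch set by the AND gate and cleared when the FPGA reads it."""
    def __init__(self):
        self.is_set = False

    def pulse(self):              # AND gate fires: latch is set
        self.is_set = True

    def poll_and_reset(self):     # FPGA reads the latch, then clears it
        was_set, self.is_set = self.is_set, False
        return was_set

latch, counter = Latch(), 0
coincidence_between_polls = [True, False, True, True, False]
for fired in coincidence_between_polls:
    if fired:
        latch.pulse()
    counter += latch.poll_and_reset()  # at most one count per poll cycle
print(counter)  # -> 3
```

Because each latch is cleared only when it is read, at most one coincidence can be registered per polling cycle, i.e. per 12.5 ns for an 80 MHz FPGA.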
\ignore{
\begin{figure}[!t]
  \begin{center}
  \subfigure[Schematic diagram]{\label{fig:coincidence_circuit}\includegraphics[scale=0.5,angle=270]{Figures/coincidence_schematic.eps}}
  \subfigure[PCB Layout]{\label{fig:coincidence_pcb}\includegraphics[scale=0.5,angle=90]{Figures/coincidence_board.eps}} \\
  \end{center}
  \caption{Schematic diagrams and PCB layout for fast coincidence
    detection electronics}
  \label{fig:coincidence_design}
  \end{figure}
}

\section{Polarization manipulation}

\subsection{Polarization projection}
A polarizer is an optical element that transmits
one polarization component while either absorbing or reflecting
the other.  Typically polarizers project onto a linear basis,
acting as either polarization filters or polarizing
beamsplitters (PBSs).   

There are three kinds of polarizer commonly used in our lab.
\begin{enumerate}
\item {\bf Birefringent crystal polarizing beamsplitters and
  polarizers:} These polarizers are typically made of highly
  birefringent materials like calcite and rely on the large
  transverse walkoff in these materials to separate the two
  polarizations.  Typically two crystals are cut in a prism
  configuration to enhance the spatial splitting between the two
  polarizations.  In the Glan-Thompson configuration the
  extraordinary beam is directed into the side of the mount
  while the ordinary beam is transmitted with a small spatial
  displacement.  In the Glan-Taylor configuration the extraordinary
  beam is reflected out of the prism at an angle while the
  ordinary beam is transmitted, resulting in a polarizing
  beamsplitter.  Extinction ratios of $1:10^5$ or better are
  claimed for these polarizers, making them an order of magnitude
  better than other polarizers.  They are also extremely
  broadband, being limited only by the material absorption of
  calcite and the quality of the anti-reflection coating.  This
  is the \emph{only} type of polarizer I would recommend
  using in any future experiments.  The only downside is the
  slight displacement of the transmitted beam, and for
  applications that are sensitive to this, a plate polarizer can
  be used instead.
\item {\bf Dielectric polarizing beamsplitters:} This is the type
  of PBS that I initially used.  It consists of multiple layers
  of glass and relies on Fresnel reflection\cite{BornWolf} at Brewster's
  angle to reflect only the $s$-polarized light.  After multiple
  reflections the reflected light should all be $s$-polarized and
  the transmitted light all $p$-polarized.  I tried many such
  polarizers before giving up on them altogether.  Not only was
  the polarization of the reflected light extremely sensitive to
  the incidence angle, but detection of the transmitted light did not
  project onto a linear polarization state; rather, it typically projected
  onto a state with a $3$--$4\%$ circular component.  This can be seen in the
  plots of figure \ref{fig:comparing_polarizers}, which show the
  transmission of light between parallel polarizers as a
  quarter-waveplate is rotated between them.  Figure
  \ref{fig:calcite_polarizer} shows the transmission when
  the second polarizer of the parallel pair was a Thorlabs GTM10 Glan-Thompson
  polarizer while figure \ref{fig:dielectric_polarizer} shows
  the result when it was a Thorlabs PBS3 dielectric polarizing
  beamsplitter.  The troughs of the curves indicate where the
  light output from the quarter-waveplate is right or left
  circularly polarized.  A linear polarizer will have $50\%$
  transmission for either of these, but the dielectric polarizer
  preferentially transmits one circular polarization over the
  other.  This kind of behaviour wreaks havoc on a
  polarization analyzer because whereas rotations of linear
  polarizations can be easily corrected with a half-waveplate,
  rotations to elliptical polarizations generally require three
  waveplates to fix.  This means that three mutually dependent
  angles have to be properly set.  Finding these angles was found to
  be an exercise in futility.
\item {\bf Plate polarizers:}
Plate polarizers work quite well, but they are lossy elements
that absorb one of the two polarizations.  Typically they
contain a material, either polymer or metal, that conducts in
one direction leading to strong absorption for one polarization and
weak absorption for the other.  The main advantage these have
over calcite polarizers is that they do not displace the beam,
making them useful for dropping in to an interferometer without
affecting alignment.  The plate polarizers tested were found to
project onto linear bases, but with extinction ratios measurably lower
than for the calcite polarizers.  Measured extinction ratios varied
from $\sim 10^{-3}$ for the Edmund Optics Polaroid polarizers to $\sim 10^{-4}$
for the Polarcore polarizers.
\end{enumerate}
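The diagnostic described in item 2 can be modelled with Jones calculus.  The sketch below (the matrix conventions and normalizations are my assumptions, not taken from the measurements) shows that an ideal linear polarizer transmits exactly half of either circular polarization produced by a quarter-waveplate at $\pm 45^\circ$; unequal trough depths therefore betray an elliptical projection:

```python
import numpy as np

def retarder(theta, phi):
    """Jones matrix of a waveplate with retardance phi, fast axis at theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ np.diag([1, np.exp(1j * phi)]) @ R.T

H = np.array([1.0, 0.0])          # horizontal light from the first polarizer
P_H = np.array([[1, 0], [0, 0]])  # ideal linear polarizer along H

# A quarter-waveplate at +-45 deg turns H into right/left circular light;
# an ideal *linear* polarizer then transmits exactly half of either:
for theta in (np.pi / 4, -np.pi / 4):
    out = P_H @ retarder(theta, np.pi / 2) @ H
    print(float(np.abs(out @ out.conj()).round(3)))  # -> 0.5 both times
```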

\begin{figure}
\begin{center}
\subfigure{\label{fig:calcite_polarizer}\includegraphics[scale=0.6]{Figures/QWP_calcite.eps}}
\subfigure{\label{fig:dielectric_polarizer}\includegraphics[scale=0.6]{Figures/QWP_dielectric.eps}}
\caption{Comparison of calcite and dielectric polarizers.  The unequal
transmission of left and right circular light for the dielectric
polarizer is difficult to correct for in a polarization analysis system.}
\label{fig:comparing_polarizers}
\end{center}
\end{figure}

\section{Polarization rotators}
A polarizer makes a measurement along a ray on the Poincar\'e
sphere.  In order to obtain a complete set of polarization
measurements, a given measurement must be rotated around two
non-collinear axes on the Poincar\'e sphere.  The usual way to
accomplish this is to use two waveplates, a quarter-waveplate
and a half-waveplate, and to rotate them to different angles.
With good waveplates and accurate positioning, measurements can
be made accurate to $0.1\%$ or less.
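As a sketch of how the two waveplate angles map onto polarization projectors (the Jones-calculus conventions and the QWP-before-HWP ordering are assumptions of this illustration, not taken from the lab software), one can verify, for example, that a quarter-waveplate at $45^\circ$ followed by a half-waveplate at $22.5^\circ$ in front of an H-transmitting polarizer projects onto the diagonal state:

```python
import numpy as np

def retarder(theta, phi):
    """Jones matrix of a waveplate with retardance phi, fast axis at theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ np.diag([1, np.exp(1j * phi)]) @ R.T

def analyzer_prob(state, qwp_angle, hwp_angle):
    """Probability that `state` is transmitted at the H port of the PBS
    after passing the QWP and then the HWP (assumed ordering)."""
    amp = (retarder(hwp_angle, np.pi) @ retarder(qwp_angle, np.pi / 2) @ state)[0]
    return abs(amp) ** 2

D = np.array([1.0, 1.0]) / np.sqrt(2)   # diagonal polarization
A = np.array([1.0, -1.0]) / np.sqrt(2)  # anti-diagonal polarization

# QWP at 45 deg and HWP at 22.5 deg implement a projector onto D:
print(round(analyzer_prob(D, np.pi / 4, np.pi / 8), 6))  # -> 1.0
print(round(analyzer_prob(A, np.pi / 4, np.pi / 8), 6))  # -> 0.0
```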

An alternative approach that we tried with some success was to use
two liquid crystal waveplates at fixed angles and to vary the
birefringent phase delay induced by the liquid crystal.  This
had the advantage of being much faster since the system had no
moving parts.  In general, though, the liquid crystals were
found to be of poorer quality than waveplates in terms of
extinction through parallel polarizers.  Additional effects such
as long-term temperature drift and slight non-repeatability of phase control also contributed
to errors.  Generally, though, these effects amounted to a $1$--$2\%$
error and were tolerable for many measurements.

Where liquid-crystal waveplates were most useful was
in creating random or mixed polarization states by applying
pseudo-randomly generated voltages to them.  Since the waveplates could typically
be switched at a rate comparable to the incident photon pair
flux, a liquid crystal could turn a polarized photon pair train
into a completely unpolarized one.  

In the following sections we will discuss the experimental considerations
involved in using these active optical elements for polarization
control.  
\subsection{Waveplates}

\subsection{Motorized rotation stages}
In 2004 we purchased nine Newport PR50 stepper
motor rotation stages with built-in optical encoders and three
Newport ESP300 GPIB-enabled three-axis stepper motor drivers.
The motors were rated to an accuracy of $0.1^\circ$ with a
precision of $0.01^\circ$.  The reality was somewhat different as
can be seen from the plot in figure
\ref{fig:bad_wp} which shows the measured transmission
through parallel polarizers with a quarter-waveplate being
rotated between them through $360^\circ$ according to the motor encoder.
Clearly at the end of travel the waveplate is not at the same
angle that it was at the beginning of travel as can be seen by
comparing it to the expected sine wave.  This particular encoder
loses $5^\circ$ in $360^\circ$.  Of the nine motors we had, four were
accurate to within manufacturer specifications while the others
lost between $2^\circ$ and $5^\circ$ in going through
$360^\circ$.  This problem was difficult to identify, in large part because errors in waveplate accuracy,
the polarizing beamsplitters, the polarization alignment of the
SPDC and the statistical errors inherent in single-photon counting also contributed to unexpected
results.  Many months of work could most likely have been saved if the motors had
been fully characterized on a simple test rig before being put
into the experiment.  
\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[width=\columnwidth]{Figures/bad_wp.eps}}
  }
  \caption{Plot showing the effect of the waveplate losing $5^\circ$ of
  angle over $360^\circ$ of travel.  Note that the intensity at
  $-135^\circ$ differs noticeably from that at $225^\circ$ and the
  best-fit sine curve leads the data for low angles and lags for large
  angles.}
  \label{fig:bad_wp}
  \end{figure}

Once the problem was understood it was possible to correct for
it by only moving the waveplates to known positions from a known
starting position.  This was a bandaid solution that caused
additional grief whenever any part of the polarization
manipulation had to be re-calibrated.  In retrospect, it would clearly
have been better to have returned the bad motors to Newport to be
fixed as soon as the problem was discovered, even at risk of
delaying experimental work.  The importance to the work in this
dissertation of being able to set a waveplate angle accurately and
repeatably to an arbitrary known position cannot be overstated. 

\subsection{Calibration}
Initial efforts at calibration of waveplate angles relied on the usual
techniques for polarimetry, namely the measurement of the Stokes
parameters by setting the waveplates to four known sets of angles and
measuring the transmission
through crossed polarizers.  This was a bad approach because it
failed to distinguish between errors caused by incorrect
waveplate angles, errors caused by bad polarizers and errors
caused by inaccurate waveplate retardances.  Although considerable effort
was put into trying to disentangle these different error sources
(along with other possible error sources that were dreamed up),
it ultimately proved futile.  

Significant improvement was achieved by taking a massively
over-complete set of measurements by stepping each waveplate
through $360^\circ$ a few degrees at a time.  This allowed the
problems caused by the inaccurate encoder values to be
distinguished from other problems and made it clear that the
dielectric polarizing beamsplitters were treating left and right
circular polarizations differently.  This in turn allowed the effects of other optical
elements in the system on polarization to be better understood.
The most significant problems were caused by a set of prisms
that rotated the polarization by a few degrees and dielectric
mirrors that caused a relative phase shift of up to $12^\circ$
between the $p$ and $s$ polarizations.  Once these effects were
recognized it was possible to correct for the rotations with
waveplates and for the phase shifts with liquid-crystal
waveplates.  The alternative approach of correcting phase shifts
with thick pieces of quartz tilted away from normal incidence
was found to be impractical because it induced
polarization-dependent losses due to Fresnel reflections and caused
decoherence due to longitudinal walkoff.

In order to calibrate waveplates accurately one needs to either
calibrate out or reduce background light levels.  Coincidence
measurements or the use of an alignment laser with a photodiode allowed
essentially background-free measurements to be made.  When
calibration was done using SPDC singles rates, the background had to 
be separately measured and subtracted out. 

The final calibration method is given below.  It assumes that
SPDC has already been found and that alignment beams are set up
to follow the same optical path as the collected SPDC.
 
\begin{itemize}
\item {\bf Align polarizing beamsplitters with the table and the
  incident beam:} This is done by sending in a beam parallel
  to the table and making sure that the surface of the PBS
  retroreflects it and that the beam out of the
  reflected port is also parallel to the table.  
\item{\bf Align a plate polarizer with the PBS:} This is done by
  measuring transmission through the plate polarizer
  followed by the PBS.  For crossed alignment, minimize the
  transmitted port light; for parallel alignment, the reflected
  port light.  We will assume that the plate polarizer is aligned
  parallel to the PBS for the remainder of the calibration.
\item{\bf Insert a rotation stage with a half-waveplate between
  the two polarizers and rotate through $360^\circ$:}  Dirt, scratches and poor
  centering will generally result in a trace with no symmetry,
  one lobe of the sine wave being smaller than the other three,
  for instance.  A half-waveplate with a phase delay less
  than or greater than $180^\circ$ results in non-zero
  transmission even when the waveplate is $45^\circ$ to the
  polarizer axis.  More generally, a waveplate with a phase
  delay of $\phi$ will result in a transmission through parallel
  polarizers as a function of waveplate angle $\theta$ given by
\begin{equation}
T(\theta,\phi)=\frac{1}{4}\left(2 \cos^2 \frac{\phi}{2}+ 2
\cos^2 2\theta-\cos \phi \cos 4\theta+1\right)
\label{eq:waveplate_equation}
\end{equation}
Thus if one can trust the rotation stage to accurately set the
waveplate angle, one can simply use a one-parameter fit to the
transmission to obtain the true waveplate phase delay.  This
phase delay can be recorded and modeled in future polarization
analysis.  Using this technique, the FocTek half-waveplates in
the lab have been measured to have phase delays that differ from
$180^\circ$ by up to $0.2^\circ$.

\item{\bf Insert a rotation stage with a quarter-waveplate
  between the two polarizers:} The quarter-waveplate is important
  because it tests the linearity of the incident light and of the final
  polarizer.  Again, one can plot transmission through the apparatus
  over the $360^\circ$ rotation of the waveplate.  If both the
  polarizers project onto a linear basis, then the sinusoidal
  transmission will only have one frequency component, even if
  the quarter-waveplate is imperfect.  If either the input
  polarizer or the output polarizer projects onto an elliptical
  basis, then there will be an additional frequency component
  present in the transmission with twice the period of the main
  component (i.e. with $180^\circ$ symmetry rather than $90^\circ$
  symmetry).  If the polarizers are both linear, the resulting
  curve can be fit to equation \ref{eq:waveplate_equation}.  The
  FocTek quarter-waveplates in the lab were measured to have
  phase delays differing from $90^\circ$ by up to $3^\circ$.
  Again, by keeping track of the measured phase delay,
  tomography procedures can be adjusted to account for it.
\item{\bf Move the first polarizer and waveplate back through
  the setup one optical element at a time, noting any phase
  shifts in the transmission curves:}  This will detect
  polarization rotations due to mirrors, beamsplitters, prisms
  and other elements.  Pure rotations can be corrected with
  half-waveplates until the phase difference in the transmission
  curves is eliminated.  To measure pure phase shifts, both the
  parallel polarizers should be rotated by $45^\circ$ and the
  quarter-waveplate rotated between the optical element being
  tested and the final polarizer.  Assuming a true quarter-waveplate, if the optical element induces a phase shift $\phi$
  and the quarter-waveplate rotation angle is $\theta$, the
  transmission will be 
\begin{equation}
T(\theta,\phi)=\frac{1}{2}+\frac{1}{4}\cos\phi -
\frac{1}{8}\cos\left(\phi-4\theta\right)-
\frac{1}{8}\cos\left(\phi+4\theta\right)-\frac{1}{4}\sin\left(\phi-2\theta\right)-\frac{1}{4}\sin\left(\phi+2\theta\right)
\end{equation}
The phase shift $\phi$ can be found by fitting to this curve.

\item{\bf When all the optical elements have been corrected for,
  adjust the angle of the crystal to minimize SPDC transmission
  through the detector PBS:} This should get the SPDC
  polarization aligned with the polarizer axes.  Since the SPDC
  is polarized orthogonal to the crystal axis, it is very nearly
  linearly polarized and will align well with the polarizer axis.  A
  problem may occur if the two crystals aren't at exactly
  $90^\circ$.  For the two crystals currently in the lab this
  misalignment appears to be $0.3^\circ$, which means that only
  one of the two crystals can be aligned perfectly with the
  polarizer axis.  The misalignment of the other crystal is
  negligible for most purposes.
\end{itemize}
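The two fit formulas used in this procedure can be cross-checked numerically against Jones calculus.  The sketch below is written under sign conventions I have assumed (the retarder matrix and the element orderings are assumptions of the illustration); the parallel-polarizer transmission is written in the equivalent compact form $1-\sin^2 2\theta\,\sin^2(\phi/2)$:

```python
import numpy as np

def retarder(theta, phi):
    """Jones matrix of a waveplate with retardance phi, fast axis at theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ np.diag([1, np.exp(1j * phi)]) @ R.T

H = np.array([1.0, 0.0])
D = np.array([1.0, 1.0]) / np.sqrt(2)   # 45 deg linear polarization

for phi in (np.pi, 0.9 * np.pi, np.pi / 2):
    for theta in np.linspace(0.0, 2 * np.pi, 73):
        # Retarder of phase delay phi between parallel H polarizers:
        jones = abs(H @ retarder(theta, phi) @ H) ** 2
        formula = 1 - np.sin(2 * theta) ** 2 * np.sin(phi / 2) ** 2
        assert abs(jones - formula) < 1e-12
        # Phase-shifting element followed by a QWP at theta, between
        # parallel polarizers at 45 deg (the pure-phase-shift measurement):
        jones2 = abs(D @ retarder(theta, np.pi / 2)
                       @ np.diag([1, np.exp(1j * phi)]) @ D) ** 2
        formula2 = (0.5 + 0.25 * np.cos(phi)
                    - 0.125 * np.cos(phi - 4 * theta)
                    - 0.125 * np.cos(phi + 4 * theta)
                    - 0.25 * np.sin(phi - 2 * theta)
                    - 0.25 * np.sin(phi + 2 * theta))
        assert abs(jones2 - formula2) < 1e-12
print("fit formulas agree with Jones calculus")
```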

\subsection{Liquid crystal waveplates}
For some experiments it was necessary to have the ability
to switch quickly between two polarizations.  Ideally this would have
been done with an electro-optic modulator (EOM), but due to the
high price of these devices we opted to use liquid crystal
waveplates (LCWPs) instead.  LCWPs are much slower than EOMs,
having a response on the order of milliseconds rather than
nanoseconds, but are easy to drive and can be purchased for under
\$1000.  They are a useful addition to the polarization control
toolbox because, unlike waveplates, they have a variable phase
delay.  This makes them ideal for correcting phase delays that
accumulate in the system as well as providing reasonably fast
switching between different polarization states.  A downside is
that the delay they impart varies non-linearly with the applied
voltage, making calibration and interpolation more involved.  In
addition, when placed between parallel calcite polarizers at $45^\circ$
the LCWPs we tested could not achieve the same isolation as
half-waveplates, showing a minimum transmission of $0.6\%$.
Additionally, the LCWPs displayed some long-term
drift in phase shift at a given applied voltage.  This is likely
due to the $0.4\%/^\circ$C temperature dependence of the
crystals\cite{Meadowlark_catalogue} and the lack of temperature control
in the lab. 

We have purchased five LCWPs from BolderVision
Optik and four LCWPs from Meadowlark.  Devices from both
companies had similar performance in terms of retardation, but
the mechanical construction of the Meadowlark devices was much
better.  Most notably, the BolderVision LCWPs induce a beam
deflection of $2^\circ$, presumably due to poor alignment or
manufacture of the optical flats.  The Meadowlark devices do not
do so.  Practically this meant that inserting a BolderVision
device in the input arm of an interferometer generally destroyed
the interference whereas doing the same with a Meadowlark device
did not.  

Liquid crystal waveplates are based on nematic liquid crystals.
These are fluids of anisotropic molecules for which it is energetically
favourable for each molecule to align with its neighbours.
Liquid crystal material can be sandwiched between two optical
flats structured in such a way as to induce alignment of the
crystals along a particular direction.  When a voltage is
applied across the two flats, the liquid crystal axis tilts to
align with the field and the birefringent phase shift induced on
light by the crystal is reduced.  Even at very large applied
voltages there remains a residual phase shift which can be
compensated by attaching a fixed retarder to the LCWP.  In this
way commercial LCWPs can achieve a variable retardance that
ranges, depending on the unit, from a little
less than $270^\circ$ to nearly $360^\circ$ at our operating wavelength of $810$ nm.  The liquid
crystals can be damaged by electromigration resulting from the
application of a DC electric field.  For this reason the LCWPs
needed to be driven with an AC field.  An AC field 
defines an alignment direction for the crystals, but without
inducing electromigration.

Figure \ref{fig:LCWP_data} shows some typical coincidence data when an
LCWP with its axis at $45^\circ$ is used to rotate the pump
polarization while one polarization of SPDC is collected.  The plot
shows the pronounced non-linear response of the waveplate's phase
delay to applied voltage.  The LCWPs also show a small amount of
hysteresis, which was corrected for by always making transitions
from low voltages to high voltages.  

\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[width=\columnwidth]{Figures/LCWP.eps}}
  }
  \caption{Plot showing the coincidence rate of horizontally
    polarized SPDC photons while a voltage was
    applied to a LCWP with its axis at $45^\circ$ to the pump polarization.  The $x$-axis gives the peak-to-peak
    voltage of a 2 kHz square wave applied across the liquid crystal.}
  \label{fig:LCWP_data}
  \end{figure}

\subsection{Liquid crystal waveplate driver}
Following the recommendations of the engineer at BolderVision,
I designed a driver for the LCWPs that produced a 2 kHz AC square
wave with a variable peak-to-peak voltage between 0 and 30 V.
The voltage was referenced to a 10-bit MAX5201 digital-to-analog
converter with a
temperature-controlled voltage-reference.  The reference voltage was
modulated with a MAX4614 analog switch and increased to the desired
level with an OPA445 high-voltage op-amp.  The whole unit was
controlled over USB through a Delcom 802600 microcontroller.  The
unit could control the voltage applied to four different LCWPs.
Three units were built, and although one of them has two output
ports that never worked, the three units have been in continual
operation for two years without any issues.  The schematic and PCB
layout for the driver can be found in the appendices. 

The response time of the LCWPs is faster when going from a
low to a high voltage where the applied field drives the motion
of the crystal molecules than when going from high to low
voltage when the crystal must relax thermally into its
equilibrium state.  Figure \ref{fig:LCWP_timing} shows the
measured transmission through crossed polarizers with a
BolderVision waveplate between them at $45^\circ$ to the
polarizer axes.  The $20$\% to
$80$\% rise time in response to an applied voltage was $800$ $\mu$s whereas
for the removal of the applied voltage it was $13$ ms.

\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[angle=270,width=\columnwidth]{Figures/LCWP_timing.ps}}
  }
  \caption{Rise and fall times for LCWP waveplate transmission when a
    step voltage is applied and removed.  The fall time is longer
    because the liquid crystal must relax back to the equilibrium
    state.}
  \label{fig:LCWP_timing}
  \end{figure}


\section{Polarization analysis}
We have now described all the elements that go into a
polarization analysis system.  The standard approach to
measuring the polarization of light is to put a quarter- and a
half-waveplate in front of a polarizer or polarizing beamsplitter.
Table \ref{tab:polarization_analysis_settings_4} lists the usual waveplate settings together
with the quantum projector implemented by detection of a photon in
the transmission port of the beamsplitter:
\begin{table}
\begin{center}
\begin{tabular}{l|l|l}
QWP&HWP&Projector\\\hline
$0^\circ$ & $0^\circ$ & $\ket{H}\bra{H}$\\
$0^\circ$ & $45^\circ$ & $\ket{V}\bra{V}$\\
$45^\circ$ & $22.5^\circ$ & $\ket{D}\bra{D}$\\
$45^\circ$ & $-22.5^\circ$ & $\ket{A}\bra{A}$\\
$0^\circ$ & $22.5^\circ$ & $\ket{R}\bra{R}$\\
$0^\circ$ & $-22.5^\circ$ & $\ket{L}\bra{L}$
\end{tabular}
\caption{Settings for polarization analysis with waveplates}
\label{tab:polarization_analysis_settings_4}
\end{center}
\end{table}
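The correspondence between these settings and the listed projectors can be checked with Jones calculus.  The sketch below is my own illustrative check, not code from the apparatus; it assumes the retarder convention $J(\theta,\delta)=R(\theta)\,\mathrm{diag}(1,e^{i\delta})\,R(-\theta)$ and $\ket{R}=(\ket{H}-i\ket{V})/\sqrt{2}$, and under the opposite sign conventions the $R$ and $L$ rows swap.

```python
import numpy as np

def rot(t):
    """2x2 rotation matrix by angle t (radians)."""
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def retarder(theta_deg, delta):
    """Jones matrix of a retarder: fast axis at theta_deg, retardance delta.
    Sign convention diag(1, exp(+i delta)); conventions differ between texts."""
    t = np.radians(theta_deg)
    return rot(t) @ np.diag([1, np.exp(1j * delta)]) @ rot(-t)

def analyzer_state(qwp_deg, hwp_deg):
    """State projected onto when an H photon is detected behind QWP then HWP."""
    w = retarder(hwp_deg, np.pi) @ retarder(qwp_deg, np.pi / 2)
    chi = w.conj().T @ np.array([1, 0], dtype=complex)  # chi = W^dagger |H>
    return chi / np.linalg.norm(chi)

s = 1 / np.sqrt(2)
table = {(0, 0): [1, 0], (0, 45): [0, 1],                   # |H>, |V>
         (45, 22.5): [s, s], (45, -22.5): [s, -s],          # |D>, |A>
         (0, 22.5): [s, -1j * s], (0, -22.5): [s, 1j * s]}  # |R>, |L>

for (q, h), target in table.items():
    fid = abs(np.vdot(np.array(target), analyzer_state(q, h))) ** 2
    assert fid > 0.999, ((q, h), fid)
```

Each setting reproduces its projector up to a global phase, which is physically irrelevant for the measured count rates.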
The single-qubit density matrix\cite{MikeandIke} or, equivalently, the Wolf
coherence matrix\cite{MandelandWolf} can be constructed directly
from these values.  In fact, because these
measurements form a set of mutually unbiased bases (see Chapter
4), the density matrix can be written directly in terms of their
expectation values as
\begin{align*}
\rho= \bra{H} \rho \ket{H}\cdot \ket{H}\bra{H} & + \bra{V} \rho
\ket{V}\cdot\ket{V}\bra{V}+ \\
&\bra{D} \rho \ket{D}\cdot \ket{D}\bra{D}+
\bra{A} \rho \ket{A}\cdot\ket{A}\bra{A}+ \\
&\bra{R} \rho
\ket{R}\cdot\ket{R}\bra{R}+ \bra{L} \rho \ket{L}\cdot\ket{L}\bra{L}-\mathbb{I}_2
\end{align*}
where $\mathbb{I}_2$ is the two-by-two identity matrix.
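As a numerical sanity check of this reconstruction formula, one can generate the six probabilities from a known density matrix and verify that the formula returns it.  A minimal sketch (the test state and the circular-polarization sign convention are my own choices):

```python
import numpy as np

s = 1 / np.sqrt(2)
# The six analyzer states |H>, |V>, |D>, |A>, |R>, |L|
# (|R> = (|H> - i|V>)/sqrt(2) under one common sign convention)
states = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex),
          np.array([s, s], dtype=complex), np.array([s, -s], dtype=complex),
          np.array([s, -1j * s]), np.array([s, 1j * s])]

def reconstruct(rho):
    """Rebuild rho from the six expectation values <b|rho|b>."""
    out = -np.eye(2, dtype=complex)           # the -I_2 term
    for b in states:
        p = np.vdot(b, rho @ b).real          # measured probability <b|rho|b>
        out += p * np.outer(b, b.conj())      # weight the projector |b><b|
    return out

# Example: a partially mixed single-qubit state with some coherence
rho_true = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
assert np.allclose(reconstruct(rho_true), rho_true)
```

The identity holds for any unit-trace Hermitian matrix, which is what makes the three mutually unbiased bases tomographically complete for a qubit.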

It is easy to extend this polarimetry scheme to two-photon polarization
states following the method of James \emph{et al.}\cite{James2001}.  The
measured quantity becomes the rate of coincidence counts between
different polarizations for the two photons.  This can be
measured as the rate of coincident detection of photons passing
through two independently set single-photon polarization
analyzers.  The two-qubit density matrix has fifteen independent
elements plus an overall normalization, so it is sufficient to
measure 16 linearly-independent coincidence rates obtained by
setting the two analyzers to sixteen specific pairwise
combinations of the settings in Table
\ref{tab:polarization_analysis_settings_4}.  It was later pointed
out by Altepeter \emph{et al.}
\cite{Altepeter2005} that better results could be obtained if
all 36 pairwise combinations of measurements in Table
\ref{tab:polarization_analysis_settings_4} were used.  Subsequent
tomography work in the group took advantage of this insight.
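The informational completeness of the 36-measurement scheme can be checked directly: vectorizing the 36 two-qubit projectors and computing the rank of the resulting measurement matrix confirms that they span the full 16-dimensional operator space.  A small sketch of this check (my own illustration, not the original analysis code):

```python
import numpy as np
from itertools import product

s = 1 / np.sqrt(2)
# Single-qubit analyzer states |H>, |V>, |D>, |A>, |R>, |L>
single = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex),
          np.array([s, s], dtype=complex), np.array([s, -s], dtype=complex),
          np.array([s, -1j * s]), np.array([s, 1j * s])]

# All 36 pairwise settings correspond to projectors |ab><ab|
rows = []
for a, b in product(single, repeat=2):
    ab = np.kron(a, b)
    rows.append(np.outer(ab, ab.conj()).flatten())
M = np.array(rows)                        # 36 x 16 measurement matrix

assert M.shape == (36, 16)
assert np.linalg.matrix_rank(M) == 16     # tomographically (over)complete
```

Because the rank is 16 while there are 36 rows, the reconstruction is overdetermined; the redundancy is what improves the results in practice.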

In the type-II apparatus, waveplates were used in the same
manner, but the interpretation of the results was more
complicated.  This will be explained in detail in Chapter 3.

Some tomography work was also done using the liquid crystal
waveplates.  Two LCWPs were placed in front of a polarizing
beamsplitter with their axes at $22.5^\circ$ and $45^\circ$ in
the order LCWP at $22.5^\circ$, LCWP at $45^\circ$, polarizer.
The phase delays at the two LCWPs were set to the values in Table \ref{tab:polarization_analysis_settings}.
\begin{table}
\begin{center}
\begin{tabular}{l|l|l}
LCWP at $22.5^\circ$ & LCWP at $45^\circ$ & Projector\\\hline
$0^\circ$ & $0^\circ$ & $\ket{H}\bra{H}$\\
$0^\circ$ & $180^\circ$ & $\ket{V}\bra{V}$\\
$180^\circ$ & $0^\circ$ & $\ket{D}\bra{D}$\\
$180^\circ$ & $180^\circ$ & $\ket{A}\bra{A}$\\
$0^\circ$ & $90^\circ$ & $\ket{R}\bra{R}$\\
$0^\circ$ & $270^\circ$ & $\ket{L}\bra{L}$\\
\end{tabular}
\caption{Settings for polarization analysis with LCWPs} 
\label{tab:polarization_analysis_settings}
\end{center}
\end{table}
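These settings can also be verified with Jones calculus, modelling each LCWP as a variable retarder at a fixed axis.  The sketch below is my own check, not apparatus code; it uses the retarder convention $\mathrm{diag}(1,e^{-i\delta})$ and $\ket{R}=(\ket{H}-i\ket{V})/\sqrt{2}$, and with the opposite sign convention the $R$ and $L$ rows swap.

```python
import numpy as np

def rot(t):
    """2x2 rotation matrix by angle t (radians)."""
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def retarder(theta_deg, delta_deg):
    """Variable retarder: fast axis at theta_deg, retardance delta_deg.
    Sign convention diag(1, exp(-i delta)); conventions differ between texts."""
    t, d = np.radians(theta_deg), np.radians(delta_deg)
    return rot(t) @ np.diag([1, np.exp(-1j * d)]) @ rot(-t)

def lcwp_state(d1, d2):
    """State projected onto by LCWP(22.5 deg, d1), LCWP(45 deg, d2), H-polarizer."""
    w = retarder(45, d2) @ retarder(22.5, d1)   # light meets the 22.5-deg plate first
    chi = w.conj().T @ np.array([1, 0], dtype=complex)
    return chi / np.linalg.norm(chi)

s = 1 / np.sqrt(2)
table = {(0, 0): [1, 0], (0, 180): [0, 1],                  # |H>, |V>
         (180, 0): [s, s], (180, 180): [s, -s],             # |D>, |A>
         (0, 90): [s, -1j * s], (0, 270): [s, 1j * s]}      # |R>, |L>

for (d1, d2), target in table.items():
    fid = abs(np.vdot(np.array(target), lcwp_state(d1, d2))) ** 2
    assert fid > 0.999, ((d1, d2), fid)
```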

\section{Two-photon interference}
Since Hong, Ou and Mandel first demonstrated it in 1987,
two-photon interference, or Hong-Ou-Mandel (HOM) interference, has become
a vital experimental tool in
all branches of quantum optics.  It has been used for
applications as diverse as measuring tunneling
times\cite{Steinberg1993}, compensating dispersion in white-light
interferometry \cite{Steinberg1992}, teleportation\cite{Bouwmeester1997},
implementing quantum logic gates\cite{OBrien2004}, and making GHZ
states\cite{Pan2000}.  To the cynic it can often seem that all
of experimental linear quantum optics consists of two curves,
the sine-wave and the HOM dip.  In this section we will discuss
the alignment of the Hong-Ou-Mandel interferometer, the effects
of experimental imperfections and some unpublished results
obtained over many hours spent trying to make the HOM behave as
it was supposed to.  

\subsection{Alignment}
The Hong-Ou-Mandel effect is obtained when two time- and
frequency-correlated beams of photons are incident on a
beamsplitter in such a way that the outputs match perfectly in
direction, position and time.  Setting up such an interferometer in
free space requires precision alignment
of beams with low photon fluxes, making it an experimentally
non-trivial task.  A major advantage of fiber-coupled sources over
the sources described in this chapter is that the well-defined
spatial modes of a fiber-system make aligning Hong-Ou-Mandel
interferometers much easier.  
\subsubsection{Iris scanning}
The trick to aligning the HOM is to accurately find and overlap the
two beams that are reflected and transmitted from the two input
ports of the beamsplitter.  For this, two irises are
required in one output arm, one as close to the beamsplitter as
possible, and the
other as close to the detector as possible.  If the beams
overlap at both the irises then they are said to be mode-matched.  The
beamsplitter is mounted on a translation stage and on a
kinematic mount with two tilt axes.  The two beams can be
overlapped at the beamsplitter by translating the stage to the
point where they cross and at the detector by changing the angle
on the kinematic mount.  

The SPDC at the two input ports can be labeled using
polarization.  If one input port sends in H light and the other
V light then when the irises are scanned across the beam the H
detector will give the profile of one input port and the V
detector of the other.  The position and angle of the
beamsplitter can be adjusted iteratively to achieve overlap of
50 $\mu$m or better between the two beams, which is the requirement to
see two-photon interference in the current setup.  

Overlap in the vertical direction is trickier.  While the iris
near the beamsplitter can still be used to judge positional
overlap, there is no independent degree of freedom that can be
used to adjust it.  The only way is to change the angle of a
mirror well before the beamsplitter and then adjust the
beamsplitter angle to compensate.  Even once this is done,
difficulties can arise since tilting the beamsplitter vertically
just makes each side look at different positions on the SPDC
cone without changing the SPDC rate. In practice it was
found that the vertical alignment was best achieved by carefully
aligning an alignment beam to be parallel to the table and to the pump
beam and using this as
a reference for setting the height of one of the detector
irises.  The other iris could then be scanned vertically to maximize
coincidences.  By carefully matching the alignment beams at the
two detector irises, HOM interference will usually be seen if the
other degrees of freedom are sufficiently well-matched.

In order to interfere, the beams must also overlap longitudinally to
within less than
30 $\mu$m.  This was achieved by balancing the path lengths of
alignment beams before and after the crystal using white light
interferometry.  While the coherence length of the alignment laser is too
long for the presence of interference to be a useful guide to
longitudinal overlap, if the laser is turned down below
threshold, it becomes essentially no different than an LED and
the coherence length decreases to 100 $\mu$m or so.  The two
alignment beam paths can be made equal before the crystal by
imaging the crystal plane onto a webcam with a lens and
observing the standing-wave interference.  As the laser is
turned down below threshold the standing-wave interference will
only persist if the pre-crystal path-lengths are equal to within
100 $\mu$m.  Similarly, a detector placed at the output of the
beamsplitter will record Mach-Zehnder interference whenever the
interferometer is disturbed by, say, tapping a mirror.  This
interference will disappear when the laser is turned below
threshold unless the path lengths from the crystal to the beamsplitter are
equal to within 100 $\mu$m.
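The 100 $\mu$m figure is consistent with the usual estimate $l_c \approx \lambda^2/\Delta\lambda$ for a broadband source.  Purely as an illustration (the wavelength and bandwidth below are assumed values, not measurements from this apparatus):

```python
# Coherence length estimate: l_c ~ lambda^2 / (delta lambda).
# The numbers below are assumed, illustrative values:
# a near-infrared diode with an LED-like below-threshold bandwidth.
lam = 810e-9           # center wavelength in m (assumed)
dlam = 5e-9            # emission bandwidth in m (assumed)
l_c = lam ** 2 / dlam  # coherence length in m
print(f"coherence length ~ {l_c * 1e6:.0f} um")
```

With these assumed numbers the estimate comes out on the order of 100 $\mu$m, matching the coherence length quoted above.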

With position, direction and time all overlapped one can scan
any one of these degrees of freedom to observe a characteristic
dip in the coincidence rate when photons of the same
polarization are sent into the beamsplitter.  In our experiments
we scanned the longitudinal degree of freedom by translating a
retroreflecting prism on a New Focus picomotor-powered translation stage.

While not fool-proof, this alignment procedure succeeded most of
the time in generating HOM interference.  Figure
\ref{fig:hom_scan} shows the result of a typical HOM
scan.  Once the HOM dip is located, the same degrees of freedom
can be fine-tuned to maximize the visibility.  The highest
visibility achieved was $94\%$ as compared to the maximum
achievable visibility of $96\%$ given our beamsplitter splitting
ratio.  The most likely explanation for this $2\%$ discrepancy
is that in going through different optics the two input beams
acquire different spatial profiles and do not interfere
perfectly.  It will be interesting to see if the discrepancy
disappears when the system is upgraded to single-mode fiber.
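The 96\% bound follows from the standard result that an imbalanced lossless beamsplitter limits the HOM visibility to $V_{\max}=2RT/(R^2+T^2)$.  As a quick illustration (assuming this standard formula; the measured splitting ratio itself is not reproduced here), inverting it shows what imbalance corresponds to a 96\% bound:

```python
import numpy as np

def v_max(T):
    """Maximum HOM visibility for a lossless beamsplitter with intensity
    transmission T and reflection 1 - T (standard result 2RT/(R^2+T^2))."""
    R = 1 - T
    return 2 * R * T / (R ** 2 + T ** 2)

# A perfect 50/50 splitter permits 100% visibility
assert np.isclose(v_max(0.5), 1.0)

# Invert v_max(T) = 0.96, i.e. the quadratic 3.92 T^2 - 3.92 T + 0.96 = 0
T_solutions = np.roots([3.92, -3.92, 0.96])
print(np.sort(T_solutions))   # about [0.43, 0.57]: a ~57:43 split
```

A roughly 57:43 splitting ratio is therefore enough to cap the visibility at 96\%, before any spatial-mode mismatch is taken into account.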

\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[width=\columnwidth]{Figures/sample_dip.eps}}
  }
  \caption{Measured rate of coincidences as the delay in one input arm
    of the HOM interferometer is scanned}
  \label{fig:hom_scan}
  \end{figure}

\subsection{Dips and antidips}
Often, after coarse-aligning for Hong-Ou-Mandel interference, a
scan of delay in one of the input arms resulted in a peak
rather than a dip, even when the two photons had the same
polarization.  An example of this is shown in figure
\ref{fig:antidip}.  This can be explained by considering the
multi-mode nature of the collection system.  A misalignment of the
beamsplitter in the vertical direction will couple SPDC at a different
angle for the transmission-transmission path than for the reflection-reflection
path.  This slight angle mismatch of $\sim$1 mrad between the two
coincidence amplitudes results in additional phase being acquired for
the reflection-reflection path.  The
additional phase alters the usual $\pi$ phase difference between
transmission-transmission and reflection-reflection events to
some other value that depends on the angle of the beamsplitter.
Figure \ref{fig:antidip_visibility} plots the visibility of the HOM
dip as a function of iris position; the visibility varies
sinusoidally.  This is the same phenomenon found in a misaligned
Mach-Zehnder interferometer where a small angular error away from
optimal alignment gives rise to a standing-wave interference pattern
consisting of sinusoidal fringes.  Here, though, the interference is
only visible in the coincidence rate between the two detectors and not
in the singles rate.
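A minimal phenomenological model captures this behavior: if the misalignment adds a phase $\phi$ to the reflection-reflection amplitude, the coincidence probability as a function of delay $\tau$ goes as $\tfrac{1}{2}(1-V\cos\phi\, e^{-(\tau/\tau_c)^2})$, so the dip ($\phi=0$) turns into an antidip ($\phi=\pi$).  A sketch under these assumptions (the Gaussian envelope and the parameter values are illustrative, not fitted to our data):

```python
import numpy as np

def coincidence(tau, phi, tau_c=30e-15, V=1.0):
    """Toy HOM coincidence probability with an extra phase phi on the
    reflection-reflection amplitude (Gaussian envelope, illustrative)."""
    return 0.5 * (1 - V * np.cos(phi) * np.exp(-(tau / tau_c) ** 2))

# phi = 0: ordinary dip (coincidences suppressed at zero delay)
# phi = pi: antidip (coincidences enhanced at zero delay)
# far outside the envelope the rate returns to the classical 1/2
print(coincidence(0.0, 0.0), coincidence(0.0, np.pi), coincidence(1e-12, 0.0))
```

Scanning $\phi$ at fixed $\tau=0$ gives the sinusoidal visibility variation seen in the iris-position data.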

Unfortunately, after discovering this new interference effect, we learned
that it had been published by another group a few months
earlier\cite{Kim2006}.  
\begin{figure}[!t]
\begin{center}
\subfigure[A Hong-Ou-Mandel antidip with a visibility of
55\%]{\includegraphics[scale=1]{Figures/antidip.eps}\label{fig:antidip}}
\subfigure[Visibility of the Hong-Ou-Mandel antidip as a function of
vertical iris position.  The sinusoidal variation is analogous to the
standing-wave interference seen in a misaligned Mach-Zehnder
interferometer]{\includegraphics[scale=1]{Figures/antidip_vis.eps}\label{fig:antidip_visibility}}
\end{center}
\end{figure}

\subsection{Crystal walkoff} 
Photons produced in the two SPDC crystals exhibited a HOM dip
minimum for slightly different relative path length
differences.  The observed 20 $\mu$m dip displacement can be explained by
considering that the two photons generated in the early
crystal are extraordinarily polarized in the second crystal.
Since in traveling through this second crystal they make
different angles with the optical axis, the two photons will see
different indices of refraction and hence different group
velocities.  Thus while two SPDC photons from the second crystal can be
expected to emerge at the same time, and hence have an HOM dip
at zero path length difference, two SPDC photons from the
first crystal will be offset, and will exhibit a
corresponding shift in the path-length difference for maximum dip
visibility.  The calculated relative group delay difference for
the two photons from the first crystal due to the
angle-dependent refractive index in the second crystal was 17
$\mu$m, in reasonable agreement with experiment.  Once it was understood, this mismatch in dip
locations was easily corrected with a 1.5 mm thick piece of
quartz chosen to compensate the delay between the two photons
from the first crystal.

Another effect was that the location of the beamsplitter for which the
dip visibility was maximized differed by approximately 150 $\mu$m for
the two crystals.  Figure \ref{fig:dips_displaced_plot} shows the effect of
scanning the beamsplitter location on dip visibility, while Figure
\ref{fig:dips_displaced_diagram} illustrates the explanation.  Since
the SPDC from the first crystal will walk off horizontally in
the second crystal, we expect an effective displacement between the two
sources.  When parallel rays emanating from these sources are
traced out it becomes apparent that they meet at horizontally
displaced locations which is exactly what we observe.  This
provides an interesting method of measuring the walkoff in the
crystal, but presents an impediment to using two-photon
interference as a polarization filter for states that are superpositions
of the output of both crystals.  As a compromise, the
beamsplitter was usually positioned halfway between
the two visibility maxima.  At that point the Hong-Ou-Mandel visibility
was equal to 86\% for both crystals.

\begin{figure}
\centering
\mbox{\includegraphics[width=0.5\columnwidth,scale=0.5]{Figures/vis_vs_pos.eps}}
\caption{Plot of HOM visibility versus beamsplitter
    location.}
\label{fig:dips_displaced_plot}
\end{figure}

\begin{figure}
\centering
\mbox{\includegraphics[scale=0.5]{Figures/dips_displaced_diagram.eps}}
\caption{Diagram explaining the dependence of HOM visibility on
beamsplitter location.  The horizontal photons walk off in the
crystal that generates vertical photons.  This leads to a
displacement in the location of maximum dip visibility as
the beamsplitter is scanned horizontally.}
\label{fig:dips_displaced_diagram}
\end{figure}

\subsection{Type-II apparatus}
Two-photon interference also occurs when two photons of
different polarizations are put into the same spatio-temporal
mode.  For example, if a horizontal and a vertical photon are
put into the same spatio-temporal mode, the state is described
by 
\begin{equation}
\ket{\psi}=a^\dagger_H a^\dagger_V \ket{vac}.
\end{equation}
When that same state is measured in the diagonal basis it is
convenient to rewrite it as 
\begin{align}
\ket{\psi}&=\frac{1}{2}\left(a^\dagger_D+a^\dagger_A\right)\left(a^\dagger_D-a^\dagger_A\right)
\ket{vac}\\
&=\frac{1}{2}\left(a^\dagger_D a^\dagger_D -
a^\dagger_D a^\dagger_A + a^\dagger_A a^\dagger_D - a^\dagger_A a^\dagger_A\right) \ket{vac}\\
&=\frac{1}{2}\left(a^\dagger_D a^\dagger_D  - a^\dagger_A a^\dagger_A\right) \ket{vac}.
\end{align}
The interference that occurs between the two $a^\dagger_A
a^\dagger_D$ terms is exactly analogous to the interference that
occurs between the reflected and transmitted amplitudes in the
Hong-Ou-Mandel effect.  It is even possible to generate a dip by
delaying one of the photons relative to the other.  If the
photons are temporally distinguishable then the interference
does not occur, and so the rate of coincidences between $A$ and
$D$ is determined by classical probability theory, whereas
when the two photons are spatio-temporally indistinguishable the
two-photon interference eliminates coincidences between $A$ and
$D$.  The dip visibility on the type-II apparatus was $98\pm 1\%$.  Most
likely it was higher than on the type-I apparatus because of an
effectively perfect 50/50 splitting ratio for the ``beamsplitter''
(really just a polarizer in the $45^\circ$ basis) and better
indistinguishability due to spatial filtering by the single-mode
fiber.  Further discussion of this phenomenon is left to
Chapter 3 where its relationship to distinguishability and
symmetry of the photon state is discussed in detail.
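The cancellation of the cross terms can also be checked numerically by writing the state as a coefficient matrix $M$ with $\ket{\psi}=\sum_{ij}M_{ij}\,a^\dagger_i a^\dagger_j\ket{vac}$; under the substitution $a^\dagger_H=(a^\dagger_D+a^\dagger_A)/\sqrt{2}$, $a^\dagger_V=(a^\dagger_D-a^\dagger_A)/\sqrt{2}$ the matrix transforms as $S^T M S$.  A small sketch of this check (my own illustration, not analysis code from the experiment):

```python
import numpy as np

s = 1 / np.sqrt(2)
# a_H^dag = s (a_D^dag + a_A^dag),  a_V^dag = s (a_D^dag - a_A^dag)
S = np.array([[s, s], [s, -s]])

# a_H^dag a_V^dag |vac> as a symmetrized coefficient matrix in the H/V basis
M = np.array([[0.0, 0.5], [0.5, 0.0]])

# Change of basis: since the operators commute, M transforms as S^T M S
Mp = S.T @ M @ S
print(Mp)   # ~ diag(1/2, -1/2): only the DD and AA terms survive

assert np.allclose(Mp, np.diag([0.5, -0.5]))   # no A-D coincidence amplitude
```

The vanishing off-diagonal entries are the algebraic statement that coincidences between $A$ and $D$ are suppressed when the photons are indistinguishable.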

\section{Summary}
We have presented the main experimental methods used for measuring the
quantum state of light with tomography and for carrying out the
experimental work discussed in this thesis.  These
methods allow two-photon SPDC states to be generated and collected,
their polarizations to be manipulated, and two-photon interference
to be observed.  They will be used throughout the remainder of this thesis.



